Data anonymisation of test data (English version)



ANONYMISING TEST DATA WITHIN A TEST ENVIRONMENT


Author:         Eric Spaargaren
Date:            20-06-2019
Review:        28-06-2019
Published:    July 2019
Version:        1.2



Introduction



This article explains what impact data anonymisation has on organisations. It briefly covers what the AVG legislation is about, what consequences it has for organisations, what needs to be done to anonymise data, and which checks are performed afterwards.




AVG-legislation


AVG stands for “Algemene Verordening Gegevensbescherming”, the Dutch name for the General Data Protection Regulation (GDPR), a recent European law. The law is about preventing privacy violations involving, for example, information on your laptop or customer information held by a company. Various analyses can be carried out to decide where the risks lie within a company. It may turn out that no personal data is stored at all at a certain location in the company, which can avoid a large investment. Personal data is any data that can provide information about a natural person, such as first name, last name, address, house number, postal code, place of residence and phone number. Once anonymised, this information can be used safely for testing in a test environment.

Data anonymisation in a project

To start data anonymisation as a project within a company, a sound plan covering all the steps is necessary. Clear internal communication within the company is extremely important. The database can be anonymised with an SQL script. It is also possible to use dedicated anonymisation tooling, or tooling that is part of the software application itself. Part of the test can be to establish what the script actually does, that is, which columns it selects for anonymisation. If you determine this during preparation, you will know which data in the database should have received new values after anonymisation. The data can then be validated as a next step.
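As a minimal sketch of the idea that the columns to be anonymised are fixed before the script runs, the Python example below drives plain SQL `UPDATE` statements from a small plan. The table name `personal_data`, its column names and the replacement pattern (column name plus row id) are assumptions made for this illustration only, and SQLite is used merely because it runs without a server.

```python
import sqlite3

# Hypothetical plan: table and column names are invented for this sketch.
ANONYMISATION_PLAN = {
    "personal_data": ["first_name", "last_name", "street_name"],
}


def anonymise(conn: sqlite3.Connection, plan: dict) -> list:
    """Overwrite each planned column and return the (table, column) pairs touched."""
    touched = []
    for table, columns in plan.items():
        for column in columns:
            # Identifiers come from the fixed plan above, never from user input.
            # The new value is the column name plus the row id, e.g. last_name_1.
            conn.execute(
                f"UPDATE {table} SET {column} = ? || '_' || rowid",
                (column,),
            )
            touched.append((table, column))
    conn.commit()
    return touched


def demo() -> list:
    """Build a tiny in-memory test table, anonymise it and return its rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE personal_data "
        "(first_name TEXT, last_name TEXT, street_name TEXT, street_number TEXT)"
    )
    conn.execute(
        "INSERT INTO personal_data VALUES ('Zoë', 'Jansen', 'Dorpsstraat', '12')"
    )
    anonymise(conn, ANONYMISATION_PLAN)
    return conn.execute("SELECT * FROM personal_data").fetchall()
```

Because the plan is written down before the run, the list returned by `anonymise` can be compared with it afterwards to confirm which columns received new values. Columns that are not in the plan, such as `street_number` here, keep their original content.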

After this validation it is very important to make clear agreements about which environments of the “OTA-street” (Development, Test and Acceptance; in Dutch: Ontwikkeling, Test, Acceptatie) must be anonymised and which must not. Anonymisation must never be carried out in the Production environment, because its purpose is specifically to anonymise the data within the test environment. Before anonymisation takes place, first make a copy of the database from the P-environment or T-environment. Make sure the copy is stored in a secure file location where the DBA can quickly and easily find it again.
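The two safeguards described here, never touching Production and making a copy first, could be sketched as follows. This is an illustration under assumptions: the one-letter environment labels, the file name and the use of SQLite are not from the article, and a real DBA would use the backup facility of the database product in use.

```python
import os
import sqlite3
import tempfile


def is_allowed(environment: str) -> bool:
    """Anonymisation may run anywhere except Production (label 'P' is assumed)."""
    return environment.strip().upper() != "P"


def copy_database(source: sqlite3.Connection, target_path: str) -> None:
    """Write a full copy of `source` to `target_path` before anything is changed."""
    target = sqlite3.connect(target_path)
    try:
        source.backup(target)  # sqlite3's built-in online backup
    finally:
        target.close()


def demo() -> bool:
    """Copy a small in-memory database to a file and read the copy back."""
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE personal_data (last_name TEXT)")
    source.execute("INSERT INTO personal_data VALUES ('Jansen')")
    source.commit()
    # A temporary folder stands in for the secure file location.
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "copy_before_anonymisation.db")
        copy_database(source, path)
        copy = sqlite3.connect(path)
        rows = copy.execute("SELECT last_name FROM personal_data").fetchall()
        copy.close()
    return rows == [("Jansen",)]
```

Checking `is_allowed` at the very start of a run and refusing to continue when it returns `False` keeps the rule out of people's memory and inside the procedure itself.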

As mentioned before, it must be determined which columns in the tables are to be anonymised. This can be done in the following ways:
  1. In the table environment of the specific application,
  2. With the software tool that performs the anonymisation,
  3. Within the SQL script,
  4. In the table environment of the back-office application.
The two tables below show which columns can be anonymised:

Example of personal data within the table “Personal data”:

| Item | Table name_Column name | Data anonymisation | Diacritic |
| --- | --- | --- | --- |
| First name | Personal data_surname | Unrecognizable | No |
| Last name | Personal data_lastname | Unrecognizable | No |
| Street name | Personal data_streetname | Unrecognizable | Yes |
| Street number | Personal data_streetnumber | Recognizable | Yes |
| Postal code | Personal data_postal code | Unrecognizable | Yes |
| Residence | Personal data_residence | Unrecognizable | No |
| Telephone number | Personal data_telephonenumber | Unrecognizable | No |


Example of personal data within the table “Insurance data”:

| Item | Table name_Column name | Data anonymisation | Diacritic |
| --- | --- | --- | --- |
| Car label | Insurancedatacar_label | Unrecognizable | No |
| Car model | Insurancedatacar_model | Unrecognizable | Yes |
| Driving mechanism | Insurancedatacar_drive | Recognizable | No |
| Engine content | Insurancedatacar_content | Recognizable | Yes |
| Electric power | Insurancedatacar_ah | Unrecognizable | Yes |
| Color | Insurancedatacar_color | Recognizable | No |
| Extra type | Insurancedatacar_extratype | Unrecognizable | Yes |
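An inventory like the two tables above can also be kept as data, so that a script or tool and the later checks all work from the same list. The sketch below records three rows of the “Personal data” table. The class name, the field names and the reading of the Diacritic column as "this column may contain diacritics" are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnRule:
    item: str
    column: str
    unrecognizable: bool  # True: the value must be replaced
    diacritics: bool      # True: the column may contain diacritics (assumed meaning)


# Three rows taken from the "Personal data" table.
PERSONAL_DATA_RULES = [
    ColumnRule("Last name", "Personal data_lastname", True, False),
    ColumnRule("Street name", "Personal data_streetname", True, True),
    ColumnRule("Street number", "Personal data_streetnumber", False, True),
]


def columns_to_replace(rules: list) -> list:
    """Columns whose content must become unrecognizable."""
    return [rule.column for rule in rules if rule.unrecognizable]


def columns_with_diacritics(rules: list) -> list:
    """Columns that need extra attention because of diacritics."""
    return [rule.column for rule in rules if rule.diacritics]
```

With this in place, `columns_to_replace(PERSONAL_DATA_RULES)` yields the last name and street name columns, while the street number stays recognizable but still appears in the diacritics list.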

SQL-script

Before anonymisation takes place, the SQL script must be validated, because some data may contain diacritics, and these can cause errors when the software application runs. Align with the software development team on the question: “How will the data be anonymised?” What will be anonymised? Which data fields within the tables will be anonymised, and with what content? Once this is clear and written down, anonymisation can start.
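One possible way to find values with diacritics during this validation is sketched below in Python. It relies on Unicode decomposition from the standard `unicodedata` module. The function names are invented for this example, and whether diacritics should be removed or merely reported is a decision for the team, not something this sketch settles.

```python
import unicodedata


def has_diacritics(text: str) -> bool:
    """True when a character carries a combining mark after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return any(unicodedata.combining(char) for char in decomposed)


def strip_diacritics(text: str) -> str:
    """Return the text with combining marks removed, e.g. 'Zoë' becomes 'Zoe'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(char for char in decomposed if not unicodedata.combining(char))


def values_with_diacritics(values: list) -> list:
    """Filter a list of field values down to those that contain diacritics."""
    return [value for value in values if has_diacritics(value)]
```

Letters that have no decomposed form, such as `ø` or `ß`, are not detected this way and would need a separate rule.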



The execution of data anonymisation

A lot happens during the anonymisation process. Some users (such as application managers) or testers may be using the content of the software application, so these employees must be well informed before anonymisation takes place, because the content of the database cannot be used while the process is running. Companies can also choose to place an anonymised database between production and test to minimise the disruption caused by the anonymisation process.
During this process it is important to stay in close contact with the DBA (database administrator) who performs the execution. Anonymisation is best carried out together; it is a joint business process.



The content check after data anonymisation

During the check after anonymisation, the copy of the database must still be available in the right location. This is essential because the software application, or some of its functions, may no longer work after anonymisation. If that happens, the copy of the database can be restored. The copy can also be used to run functional regression tests when a functional delivery needs to be made.

After anonymisation the DBA checks in the database whether the data has actually been changed (anonymised). The content of first name, last name, address, house number, postal code and residence needs to be checked. Once all important items have been updated, the anonymisation process has been executed correctly.

Note: diacritics may still be present in database fields after anonymisation. In that case, a functional test can be run to check whether the functional items still work correctly. If they do not, the decision can be made to restore the copy of the database. Alternatively, a small modification can be made to the SQL script so that the diacritics are not read by the software application and are not affected when the functional items in the application are used.
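The check on whether the data has really been changed can be partly automated by comparing the copy made beforehand with the anonymised database. The sketch below counts, per column, how many rows still hold their original value. The table and column names are hypothetical, SQLite stands in for the real database, and rows are matched on SQLite's `rowid`, which assumes that no rows were added or removed in between.

```python
import sqlite3


def unchanged_values(before: sqlite3.Connection, after: sqlite3.Connection,
                     table: str, column: str) -> int:
    """Count rows whose value in `column` is the same before and after."""
    query = f"SELECT rowid, {column} FROM {table} ORDER BY rowid"
    original = dict(before.execute(query).fetchall())
    current = dict(after.execute(query).fetchall())
    return sum(1 for rowid, value in current.items()
               if original.get(rowid) == value)


def demo() -> dict:
    """Compare a copy with an anonymised database in which one value was missed."""
    before = sqlite3.connect(":memory:")
    after = sqlite3.connect(":memory:")
    for conn in (before, after):
        conn.execute("CREATE TABLE personal_data (last_name TEXT, residence TEXT)")
    before.executemany(
        "INSERT INTO personal_data VALUES (?, ?)",
        [("Jansen", "Utrecht"), ("De Vries", "Delft")],
    )
    after.executemany(
        "INSERT INTO personal_data VALUES (?, ?)",
        [("last_name_1", "residence_1"), ("last_name_2", "Delft")],
    )
    return {
        column: unchanged_values(before, after, "personal_data", column)
        for column in ("last_name", "residence")
    }
```

In this demo the result is `{"last_name": 0, "residence": 1}`: every last name was replaced, but one residence still shows its original value and needs a closer look.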



Summary

We can say that companies will want to keep their promise to comply with the AVG legislation. Implementing this legislation is often not an easy task.

An impact analysis and a risk analysis can be carried out within a company to investigate which software systems contain personal data. This is followed by checking which processes are affected by anonymisation. It can then be determined which types of data are to be included in the anonymisation process.

Clear agreements must be made with the users in their different application roles to get the right results after anonymisation. A tester will get different authorisations for the application than, for example, an end user. A developer or functional manager will have more rights, for example to solve production issues. Managing all these roles requires internal coordination within the IT department as well as with Human Resources.

Furthermore, it must be determined which fields or tables in the database are ready for anonymisation. Some examples of anonymisation techniques are emptying fields, shuffling data and reproducing data. Tools and SQL scripts exist to carry out these kinds of processes.
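Two of the techniques mentioned, emptying fields and shuffling, might look as follows in a small Python sketch on top of SQLite. The table, the columns and the fixed seed are assumptions made for the example. Note that shuffling only moves existing values between rows, so it does not guarantee that every row ends up with a different value.

```python
import random
import sqlite3


def empty_column(conn: sqlite3.Connection, table: str, column: str) -> None:
    """Replace every value in the column with an empty string."""
    conn.execute(f"UPDATE {table} SET {column} = ''")


def shuffle_column(conn: sqlite3.Connection, table: str, column: str,
                   seed=None) -> None:
    """Redistribute the existing values of the column over the rows."""
    rows = conn.execute(
        f"SELECT rowid, {column} FROM {table} ORDER BY rowid"
    ).fetchall()
    values = [value for _, value in rows]
    random.Random(seed).shuffle(values)
    conn.executemany(
        f"UPDATE {table} SET {column} = ? WHERE rowid = ?",
        [(value, rowid) for (rowid, _), value in zip(rows, values)],
    )


def demo() -> list:
    """Shuffle the last names and empty the telephone numbers of three rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE personal_data (last_name TEXT, telephone_number TEXT)"
    )
    conn.executemany(
        "INSERT INTO personal_data VALUES (?, ?)",
        [("Jansen", "0101"), ("De Vries", "0202"), ("Bakker", "0303")],
    )
    shuffle_column(conn, "personal_data", "last_name", seed=1)
    empty_column(conn, "personal_data", "telephone_number")
    return conn.execute(
        "SELECT last_name, telephone_number FROM personal_data ORDER BY rowid"
    ).fetchall()
```

After `demo()` the same three last names are still present in the table, possibly in a different order, and all telephone numbers are empty.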

After anonymisation within the test environment, functional tests are needed to check whether all important functions still work well. It must also be checked whether anonymisation has caused data corruption. Diacritics deserve special attention in this process.

