ANONYMISING OF TEST DATA WITHIN A TEST ENVIRONMENT
Author: Eric Spaargaren
Date: 20-06-2019
Review: 28-06-2019
Published: July 2019
Version: 1.2
Introduction
This article explains the impact of data anonymisation on organisations. It
briefly describes what the AVG legislation is about, what consequences it has
for organisations, what needs to be done to anonymise data, and which checks
are performed after the data has been anonymised.
AVG-legislation
AVG stands for “Algemene Verordening Gegevensbescherming”, the Dutch name for
the General Data Protection Regulation (GDPR), a recent European law. The law
is about preventing privacy violations, for example involving information on
your laptop or customer information held by a company. Various analyses can be
carried out to decide where the risks are within a company. It may turn out
that no personal data is stored at all in a certain part of the company, which
can avoid a large investment. Personal data is any data that can provide
information about a natural person, such as surname, last name, address, house
number, postal code, place of residence and phone number. Once anonymised, this
information can be used safely for testing in a test environment.
Data anonymisation in a project
To start data anonymisation as a project within a company, a correct plan
covering all the steps is necessary. It is extremely important that internal
communication within the company is clear. The database can be anonymised with
an SQL script. It is also possible to use dedicated anonymisation tooling, or
tooling that is part of the software application itself. Part of the test can
be to establish what the script actually does, that is, which columns the
script will pick up for anonymisation. If you determine this during
preparation, you know which data in the database should have received new
values after anonymisation. Validating the data can then be the next step.
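As a minimal sketch of this preparation step, the Python snippet below uses the standard `sqlite3` module with an in-memory database standing in for the real test database. The table `personal_data`, its columns and the `COLUMNS_TO_ANONYMISE` mapping are assumptions made for illustration; they are not taken from a real system. The idea is that the list of columns is fixed before the run, so afterwards it is known exactly which columns should hold new values.

```python
import sqlite3

# Hypothetical plan: column name -> prefix for the replacement value.
COLUMNS_TO_ANONYMISE = {
    "surname": "Surname",
    "lastname": "Lastname",
    "streetname": "Street",
}


def anonymise(conn, table, columns):
    """Overwrite every listed column with '<prefix>_<rowid>'.

    Returns the names of the columns that were picked up, so the
    result can be compared with the plan afterwards.
    """
    picked_up = []
    for column, prefix in columns.items():
        # Table and column names cannot be bound as parameters, so they
        # must come from a trusted plan; the prefix is bound normally.
        conn.execute(
            f"UPDATE {table} SET {column} = ? || '_' || rowid", (prefix,)
        )
        picked_up.append(column)
    conn.commit()
    return picked_up


def build_demo_database():
    """Create an in-memory table with made-up rows (test data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE personal_data ("
        "id INTEGER PRIMARY KEY, surname TEXT, lastname TEXT, "
        "streetname TEXT, streetnumber TEXT)"
    )
    conn.executemany(
        "INSERT INTO personal_data (surname, lastname, streetname, streetnumber) "
        "VALUES (?, ?, ?, ?)",
        [("Jan", "Jansen", "Dorpsstraat", "12"),
         ("Piet", "Pietersen", "Kerkweg", "3")],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = build_demo_database()
    print("Columns picked up:",
          anonymise(conn, "personal_data", COLUMNS_TO_ANONYMISE))
    for row in conn.execute("SELECT * FROM personal_data ORDER BY id"):
        print(row)
```

The column `streetnumber` is deliberately left out of the plan, so it keeps its original value and shows the difference between columns that were picked up and columns that were not.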
After this validation it is very important to make clear agreements about which
environments of the “OTA street” (development, test and acceptance) will be set
up and whether or not each of them must be anonymised. Anonymisation must never
be carried out in the Production environment, because the whole purpose is to
anonymise the data within the test environment. Before anonymisation takes
place, first make a copy of the database from the P-environment or
T-environment. Make sure the copy of the database is stored in a secure file
location where the DBA can find it again quickly and easily.
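Making the copy, and putting it back if needed, could look roughly like the sketch below. It assumes an SQLite database and uses `Connection.backup` from Python's standard `sqlite3` module; for other database systems the DBA would use the vendor's own backup tooling instead. The function names and the backup path are hypothetical.

```python
import sqlite3


def copy_database(source, backup_path):
    """Write a full copy of the open database `source` to `backup_path`."""
    target = sqlite3.connect(backup_path)
    try:
        source.backup(target)  # copies the complete source database
    finally:
        target.close()


def restore_database(backup_path, destination):
    """Put the copy back by overwriting the open database `destination`."""
    saved = sqlite3.connect(backup_path)
    try:
        saved.backup(destination)
    finally:
        saved.close()
```

A typical order would be: call `copy_database` first, run the anonymisation, and call `restore_database` only if the application no longer works afterwards.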
As mentioned before, it must be determined which columns in the table are to be
anonymised. This can be done in the following ways:
- 1. In the table environment of the specific application,
- 2. With the software tool that performs the anonymisation,
- 3. Within the SQL-script,
- 4. In the table environment of the back-office application.
The two tables below show which columns can be anonymised:
Example of personal data within the table “personal data”:
Item | Table name_Column name | Data anonymised | Diacritic
Surname | Personal data_surname | Unrecognizable | No
Lastname | Personal data_lastname | Unrecognizable | No
Streetname | Personal data_streetname | Unrecognizable | Yes
Streetnumber | Personal data_streetnumber | Recognizable | Yes
Postal code | Personal data_postal code | Unrecognizable | Yes
Residence | Personal data_residence | Unrecognizable | No
Telephonenumber | Personal data_telephonenumber | Unrecognizable | No
Example of personal data within the table “Insurance data”:
Item | Table name_Column name | Data anonymised | Diacritic
Car label | Insurancedatacar_label | Unrecognizable | No
Car model | Insurancedatacar_model | Unrecognizable | Yes
Driving mechanism | Insurancedatacar_drive | Recognizable | No
Engine content | Insurancedatacar_content | Recognizable | Yes
Electric power | Insurancedatacar_ah | Unrecognizable | Yes
Color | Insurancedatacar_color | Recognizable | No
Extratype | Insurancedatacar_extratype | Unrecognizable | Yes
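Tables like these can also be written down as a small machine-readable plan that a script or a reviewer can check. The Python sketch below does this for the “personal data” table only. The class `ColumnRule` and the two helper functions are hypothetical names, and the sketch rests on one reading of the columns: that “Unrecognizable” marks a field whose value must not be recognisable after anonymisation, and that “Yes” under Diacritic marks a field that may contain diacritics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnRule:
    item: str
    column: str           # "Table name_Column name" as written in the table
    unrecognizable: bool  # True where the table says "Unrecognizable"
    diacritic: bool       # True where the table says "Yes" under Diacritic


PERSONAL_DATA_RULES = [
    ColumnRule("Surname", "Personal data_surname", True, False),
    ColumnRule("Lastname", "Personal data_lastname", True, False),
    ColumnRule("Streetname", "Personal data_streetname", True, True),
    ColumnRule("Streetnumber", "Personal data_streetnumber", False, True),
    ColumnRule("Postal code", "Personal data_postal code", True, True),
    ColumnRule("Residence", "Personal data_residence", True, False),
    ColumnRule("Telephonenumber", "Personal data_telephonenumber", True, False),
]


def items_to_make_unrecognizable(rules):
    """Items whose values must be unrecognisable after the run."""
    return [rule.item for rule in rules if rule.unrecognizable]


def items_needing_diacritic_check(rules):
    """Items that should be checked for diacritics."""
    return [rule.item for rule in rules if rule.diacritic]
```

Keeping the plan in one place makes it easier to align with the development team on what will be anonymised before the script is run.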
SQL-script
Before anonymisation takes place, the SQL script must be validated, because
certain data may contain diacritics. These diacritics can cause errors when the
software application is run. Align with the software development team on the
question: “How will the data be anonymised?” What will be anonymised? Which
data fields within the tables will be anonymised, and with what content? Once
this is clear and written down, anonymisation can start.
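One way to check values for diacritics before or after the script runs is sketched below in Python, using the standard `unicodedata` module. The function names are hypothetical. The approach only recognises characters that decompose into a base letter plus a combining mark (such as é or ç); letters like ø or ß do not decompose and are left untouched, so this is a starting point rather than a complete check.

```python
import unicodedata


def has_diacritics(text):
    """Return True if any character carries a combining mark after NFD."""
    decomposed = unicodedata.normalize("NFD", text)
    return any(unicodedata.combining(ch) for ch in decomposed)


def strip_diacritics(text):
    """Return the text with combining marks removed (é -> e, ç -> c)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    for value in ["Curaçao", "Dorpsstraat"]:
        print(value, has_diacritics(value), strip_diacritics(value))
```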
A lot happens during the anonymisation process. A few users (such as
application managers) or testers may be using the content of the software
application, so these employees must be well informed before anonymisation
takes place, because the content of the database cannot be used while the
process is running. Companies can also choose to place an anonymised database
between production and test to minimise the disruption caused by the
anonymisation process.
During this process it is important to stay in close contact with the DBA (database administrator) who performs the execution. It helps to carry out the anonymisation together, because it is a joint business process.
The content check after data anonymisation
During the check after anonymisation, the copy of the database must still be
available in the right location. This is essential because the software
application, or some of its functions, may no longer work after anonymisation.
If that happens, the copy of the database can be restored. The copy can also be
used to run functional regression tests when a functional delivery is due.
After anonymisation the DBA checks in the database whether the data has been
changed (anonymised). The content of surname, last name, address, house number,
postal code and residence needs to be checked. Once all important items have
been updated, the anonymisation process has been executed correctly. Note:
diacritics may still be present in database fields after anonymisation. In that
case a functional test can be run to check whether the functional items still
work correctly. If they do not, the decision can be made to restore the copy of
the database. Alternatively, a small modification can be made to the SQL script
so that the diacritics are no longer read by the software application and no
longer affect the functional items in the application.
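The check on whether the data has really changed can be partly automated by comparing the anonymised database with the copy made beforehand. The Python sketch below assumes both are reachable as `sqlite3` connections and that the table has a key column that is not itself anonymised; the function name `find_unchanged` and all table and column names passed to it are hypothetical.

```python
import sqlite3


def find_unchanged(before, after, table, key, columns):
    """Return (key, column) pairs whose value is identical in both databases.

    An empty result means every checked field received a new value.
    """
    column_list = ", ".join(columns)
    # Identifiers are inserted directly, so they must come from a trusted plan.
    query = f"SELECT {key}, {column_list} FROM {table} ORDER BY {key}"
    original = {row[0]: row[1:] for row in before.execute(query)}
    leftovers = []
    for row in after.execute(query):
        old_values = original.get(row[0])
        if old_values is None:
            continue  # row does not exist in the copy, nothing to compare
        for column, old, new in zip(columns, old_values, row[1:]):
            if old == new:
                leftovers.append((row[0], column))
    return leftovers
```

Any pair that is returned points at a field that still holds its original value and deserves a closer look by the DBA.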
We can say that companies will keep their promise to comply with the AVG
legislation, although implementing this legislation is often not an easy task.
An impact analysis and a risk analysis can be carried out within a company to
investigate which software systems contain personal data. The next step is to
check which processes are affected by anonymisation. It can then be determined
which types of data can be included in the anonymisation process.
Clear agreements must be made with the users in the different roles of the
application to get the right results after anonymisation. A tester will get
different authorisations for the application than, for example, an end user. A
developer or functional manager will have more rights, for example to solve
production issues. Managing all these roles requires internal coordination
within the IT department as well as with Human Resources.
Furthermore, it must be determined which fields or tables in the database are
to be included in the anonymisation process. Some examples of anonymisation
techniques are emptying fields, shuffling data, or reproducing data. Existing
tools or SQL scripts can carry out these kinds of processes.
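Two of these techniques, emptying a field and shuffling values between rows, are sketched below in Python with the standard `sqlite3` and `random` modules. The function names and the table and column names passed to them are hypothetical. Note that shuffling keeps the real values in the table and only attaches them to different rows, so whether it is sufficient depends on the data.

```python
import random
import sqlite3


def empty_column(conn, table, column):
    """Replace every value in the column with an empty string."""
    conn.execute(f"UPDATE {table} SET {column} = ''")
    conn.commit()


def shuffle_column(conn, table, key, column, seed=None):
    """Redistribute the existing values of one column over the rows."""
    rows = conn.execute(
        f"SELECT {key}, {column} FROM {table} ORDER BY {key}"
    ).fetchall()
    keys = [row[0] for row in rows]
    values = [row[1] for row in rows]
    # A seed makes the run repeatable, which can help when testing.
    random.Random(seed).shuffle(values)
    conn.executemany(
        f"UPDATE {table} SET {column} = ? WHERE {key} = ?",
        zip(values, keys),
    )
    conn.commit()
```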
After anonymisation within the test environment, functional tests are needed to
check whether all important functions work well. It is also necessary to check
whether anonymisation has caused data corruption. Diacritics have their own
characteristics in this process and deserve separate attention.