What is the anonymization of personal data?

Watching TV or reading newspapers, we often come across the term personal data. Some warn us not to share it because someone might take out a loan in our name; others advise us on how to effectively protect our own personal data or data entrusted to us. But what is it, really?

What is personal data?

Personal data is information that allows a specific person to be uniquely identified.

A first and last name on their own may not be enough to identify a person, because the name Adam Johns is carried by several thousand people. Combined with other information, however, they do point to one specific person. A phone number, on the other hand, is personal data by itself. If we leave our phone number in the wrong place, we can expect constant phone calls urging us to install solar panels, take out a cheap loan or buy a sensational product at an unbelievably low price.

It is important to remember that not all personal data is treated equally. In addition to “ordinary” personal data such as name, surname and ID number, there is also a category of sensitive data. This includes, for example, medical data and data related to ethnic or racial origin, political views, religious beliefs, etc. This type of data should be protected in a special way.

Defining what counts as personal data is even more important for companies and business people. By accepting data from customers and storing it in their records, these companies build a database composed of personal data. This is necessary for their business, but it requires additional effort to properly secure access to this data.

It is important to remember that the issue of personal data is regulated in detail by legislation, both national and EU. You are certainly familiar with the GDPR (known in Poland as RODO), that is, the Regulation of the European Parliament and of the Council of April 27, 2016 on the protection of natural persons with regard to the processing of personal data. Because it is a regulation rather than a directive, it applies directly; in Poland it is supplemented by the Personal Data Protection Act. These are the main legal acts that protect personal data. The body that deals with the protection of personal data in Poland is the Office for Personal Data Protection. For more information, visit the Office’s website at: https://uodo.gov.pl/pl.

Protecting personal data

If we run a company, it is obvious to us that we do not only store the personal data of outsiders. A significant part of the data we store concerns our employees. We have to store it because we need this data for our day-to-day work. Even after an employee leaves, we have to keep this data because we are required by law to do so. In this case, the right to have personal data deleted is limited. It is still worthwhile to properly secure such data, for example by pseudonymizing it.

Another issue is the scope of the data. At first, we may be tempted to collect as much data as possible. However, it is worth rethinking this strategy. Storing personal data, especially sensitive data, means imposing additional obligations on ourselves. It may turn out that limiting the scope of processed data does not decrease the efficiency of our work and lets us save on data protection mechanisms.
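
Limiting the scope of data can be as simple as keeping a list of the fields a process really needs and dropping everything else before a record is stored. The snippet below is only a minimal sketch of that idea; the field names and the ALLOWED_FIELDS whitelist are hypothetical and would have to come from our own data protection policy.

```python
# Hypothetical whitelist: the only fields this process is assumed to need.
ALLOWED_FIELDS = {"customer_id", "email", "order_total"}


def minimize(record: dict, allowed: set = ALLOWED_FIELDS) -> dict:
    """Return a copy of the record that contains only whitelisted fields."""
    return {key: value for key, value in record.items() if key in allowed}


incoming = {
    "customer_id": 1017,
    "email": "adam.johns@example.com",
    "order_total": 249.90,
    "date_of_birth": "1985-03-14",  # not needed, so it is never stored
    "health_notes": "allergy",      # sensitive data, dropped as well
}

stored = minimize(incoming)
print(stored)
```

Data that was never stored does not have to be secured, anonymized or deleted later.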

One other point is worth remembering. Often as a company we cooperate with external entities, such as software providers. In order to cooperate effectively, these entities should be given access to our data. One thing to consider, however: do external companies need the actual data? Maybe it is enough to anonymize or at least pseudonymize this data before sending it. This way, the structure of the data will not be disturbed and only those data that we should not share will be removed.

What is anonymization of personal data?

The term anonymization refers to documents and records stored in company resources. This could be a binder with job applications, a customer contact database or an ERP system in which we store orders. There are various cases in which it may be necessary to anonymize personal data. Depending on the applicable data protection policy, a company may anonymize data periodically or after the termination of cooperation with the person who entrusted the personal data. Anonymization can also be triggered by customers, who have the right to request the deletion of their data from our system.

Therefore, it is important to remember, when we implement an IT system, to provide appropriate tools for data anonymization.

Data anonymization tools

Anonymization can take place at various levels. We can make data invisible in a computer system (e.g. ERP), physically delete data at the database level, remove data from document images (scans), etc. Each of these methods requires appropriate tools. When implementing an IT system, it is important to take this into account and choose a tool that has all the required functionality.

Anonymization vs. pseudonymization

What is the difference between anonymization and pseudonymization of data? The shortest explanation is that anonymization is an irreversible process whose purpose is to permanently remove the data, while pseudonymization is reversible: we want to make the data harder to access, but we don’t want to remove it from our system.

The two concepts are not mutually exclusive. We can use both techniques depending on our needs. A good example here is sensitive data. If we still need personal data to continue our work but no longer need the sensitive data once a certain stage of work is completed, we can apply both techniques: anonymize, that is, physically remove the sensitive data, and pseudonymize the remaining personal data to reduce the risk of misuse.
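
The combined approach can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production design: the PseudonymVault class, the protect function and the field names are all hypothetical, and the vault here is just an in-memory dictionary, whereas in practice the mapping would be stored separately from the working data and access to it would be restricted.

```python
import secrets


class PseudonymVault:
    """Keeps the token -> original value mapping apart from the working data."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def pseudonymize(self, value: str) -> str:
        # Reuse the existing token so one value always maps to one token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "PSN-" + secrets.token_hex(8)
        self._value_to_token[value] = token
        self._token_to_value[token] = value
        return token

    def restore(self, token: str) -> str:
        # Only someone with access to the vault can recover the original.
        return self._token_to_value[token]


SENSITIVE_FIELDS = {"health_notes"}  # assumed: removed for good
PERSONAL_FIELDS = {"name", "phone"}  # assumed: pseudonymized


def protect(record: dict, vault: PseudonymVault) -> dict:
    result = {}
    for key, value in record.items():
        if key in SENSITIVE_FIELDS:
            continue  # anonymization: the value is dropped, nothing to restore
        if key in PERSONAL_FIELDS:
            result[key] = vault.pseudonymize(value)  # reversible via the vault
        else:
            result[key] = value
    return result


vault = PseudonymVault()
record = {"name": "Adam Johns", "phone": "+48 600 100 200",
          "health_notes": "allergy", "order_id": 5521}
protected = protect(record, vault)
print(protected)
print(vault.restore(protected["name"]))
```

The sensitive field is gone from the result and cannot be brought back, while the name and phone number can be restored, but only through the vault.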

The choice of method depends on the adopted data protection policy. It is worthwhile to prepare a scenario that still allows effective work. Pseudonymization adds overhead: an employee who wants to restore the original form of the data must call additional system mechanisms. If access to the data is needed rarely, say once a month, this is not a problem. However, if such a need arises several times a day, it may turn out that the employee spends more time encoding and decoding information than on actual work.

Anonymization and pseudonymization techniques

Depending on the information we have, we can use different techniques to hide personal data. In the simplest case, where we are dealing with paper documents, a black marker and an employee who patiently crosses out all sensitive data in the document is usually enough. In the case of information systems, of course, no one will work that way. That’s why we buy an IT system that makes data anonymization automatic. But what should we look for? What anonymization and pseudonymization techniques can an IT system offer? Here are the most popular ones:

  • Randomization: replacing personal data with random data (e.g., a random string)
  • Column deletion: permanently removing data from columns containing sensitive data
  • Record deletion: deleting entire records containing sensitive data
  • Masking: replacing sensitive data with a mask, for example a string consisting only of ‘*’ characters
  • Data disturbance (perturbation): changing the data slightly so that it is similar to the original data but does not refer to a specific person
  • Data encryption: the use of an encoding or encryption function, such as one based on a public key, which changes the data to an apparently random string. It is worth noting that this is the only reversible method of the ones outlined above, i.e. it is the only method of pseudonymizing data
  • Data erasure on attachments: a special function that modifies attachments stored in the computer system in such a way that it permanently erases selected areas of these documents.
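
To make three of these techniques more concrete, here is a small sketch of column deletion, masking and data disturbance. The function names, the sample record and the 5% disturbance range are assumptions made for illustration only; a real system would take these settings from the data protection policy.

```python
import random


def drop_columns(rows: list, columns: set) -> list:
    """Column deletion: return copies of the rows without the given columns."""
    return [{k: v for k, v in row.items() if k not in columns} for row in rows]


def mask(value: str, visible: int = 0) -> str:
    """Masking: replace characters with '*', keeping the last `visible` ones."""
    hidden = max(len(value) - max(visible, 0), 0)
    return "*" * hidden + value[hidden:]


def perturb(amount: float, max_relative_change: float = 0.05, rng=None) -> float:
    """Data disturbance: shift a number by a small random percentage."""
    rng = rng or random.Random()
    factor = 1 + rng.uniform(-max_relative_change, max_relative_change)
    return round(amount * factor, 2)


rows = [{"name": "Adam Johns", "phone": "600100200",
         "salary": 5200.0, "health_notes": "allergy"}]

cleaned = drop_columns(rows, {"health_notes"})
for row in cleaned:
    row["name"] = mask(row["name"])
    row["phone"] = mask(row["phone"], visible=3)
    row["salary"] = perturb(row["salary"])

print(cleaned)
```

Note that masking which leaves some characters visible, as with the phone number here, is a weaker form of protection than a full mask, so it should be a conscious choice.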

Which technique should we use? It depends on the purpose. Randomization preserves the uniqueness of entries, so it is suitable for preparing data for transfer to external entities. Deletion is the most effective, but it spoils the structure of the data (you can no longer, for example, distinguish customers by their ID number).
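
The point about uniqueness can be illustrated with a short sketch. The randomize_column function and the sample orders below are hypothetical; the idea is simply that every distinct value receives its own random replacement, so records belonging to the same customer can still be grouped, while the mapping back to the original is thrown away.

```python
import secrets


def randomize_column(rows: list, column: str) -> list:
    """Replace each distinct value in `column` with its own random token."""
    replacements = {}
    used = set()
    output = []
    for row in rows:
        original = row[column]
        if original not in replacements:
            token = secrets.token_hex(6)
            while token in used:  # regenerate on the (unlikely) collision
                token = secrets.token_hex(6)
            used.add(token)
            replacements[original] = token
        output.append({**row, column: replacements[original]})
    # The mapping is discarded on return, so tokens cannot be mapped back.
    return output


orders = [
    {"customer_id": "C-1001", "total": 120},
    {"customer_id": "C-2002", "total": 80},
    {"customer_id": "C-1001", "total": 45},
]

shared = randomize_column(orders, "customer_id")
print(shared)
```

The first and third order still share one identifier after randomization, so an external company can work with the structure of the data without seeing the real customer IDs. Whether the result is truly anonymous still depends on what the remaining columns reveal.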

Benefits of anonymizing personal data

First of all, it should be noted once again that the protection of personal data is a legal requirement. Companies that do not comply are subject to severe financial penalties. But it doesn’t stop there. Awareness of data protection is growing in society. Information about a data leak can tarnish a company’s reputation for many years.

That’s why it’s worth remembering when implementing an IT system to pay attention to functionalities related to the anonymization of personal data. With this and a wise data protection policy, we are able to avoid considerable problems in the future.

Graduate of Jagiellonian University in Kraków with a degree in Informatics, Mathematics and Physics Department. In the company, he plays the role of analyst and quality control specialist. Specializes in modeling business processes, designing IT systems, and describing business requirements. Has over 20 years of experience as an IT systems designer and system analyst, has also run a big project for the construction and implementation of IT systems in the finance-accounting area. He is a lecturer at the College of Economics And Computer Science in Cracow, sharing with students his knowledge of workflow system operations, like NAVIGATOR.
