
What are synthetic identities and how are they used to commit fraud?

One method of identity-related fraud that has gained prominence in recent years is the creation of synthetic identities. These identities combine attributes of real identities with false attributes and are presented to organizations in order to commit fraud in other people's names. In this article, we discuss how synthetic identities are created and the mechanisms that help build trust in the identities presented.

The term synthetic identity refers to an identity created by fraudsters that mixes attributes of real people with fictitious attributes in order to open financial products. Real attributes are used because they make it possible to pass some security filters, which strengthens the synthetic identity until it is eventually used to defraud businesses or financial institutions. Because attributes of real people are involved, innocent people become associated with the fraud and can be seriously harmed, sometimes finding out only years after the fraud has been committed.

According to the National Credit Union Administration [NCU2020], this problem has been accentuated in the United States partly because the primary identifier is the Social Security number, which on its own currently reveals no attribute of the person it belongs to. As a result, a child's Social Security number, for example, can be used to open an account. In addition, this type of fraud is usually carried out not by an isolated individual but by coordinated groups that help strengthen these false identities and so raise the level of fraudulent transactions [CAR2019]. The scale of the losses reflects this: losses associated with this type of fraud were estimated at $820 million USD in 2017 and projected to reach $1.2 billion USD by 2020 [CON2018]. In one well-known case from 2013, a group of 13 individuals created 7,000 fake identities to steal $200 million USD [FBI2013].

As mentioned above, creating a synthetic identity requires at least one real attribute of an existing person. A financial institution can verify this original attribute, and other false attributes are then attached to it. These help the fraudster strengthen the identity and make institutions believe that they are dealing with a real person. Largely because of the constant information leaks suffered by companies that store personally identifiable information, attributes such as names, addresses, document numbers or even bank account numbers can be linked to a physical identity. If this information is presented to an entity that does not verify it adequately, it could be used to open a bank account. With this account, a microcredit could be taken out and repaid, which would strengthen confidence in the identity. Some attributes, such as the address, email or photo of the individual, would of course be replaced with the fraudster's own information. Once there is sufficient trust, or the credit score is high enough, the fraudster could take out a loan and never repay it, completing the fraud. The success of a synthetic identity therefore depends on establishing trust in a set of attributes that are wrongly associated with a physical identity. For this reason, solutions that seek to detect a synthetic identity are usually based on evaluating the attributes of the identity.

How can synthetic identities be detected?

As mentioned in the previous section, synthetic identity fraud exploits misplaced trust that certain attributes belong to a physical identity. The fraudster skillfully builds trust in these attributes, starting from an initial interaction in which a real attribute is used. It therefore makes sense that solutions that seek to detect a synthetic identity rely on analyzing the quality of the attributes and on correctly assigning a degree of trust to each of them.

[MCK2019] highlights how difficult it is for fraud detection systems based on machine learning to detect this type of case: synthetic identity fraud is often confused with ordinary non-payment, which complicates training. To avoid rejecting real customers, the authors propose evaluating attributes along two dimensions: depth and consistency. Depth refers to the history of the attribute. For example, in the case of a telephone number, it may be useful to know whether it has existed for a considerable time. Consistency refers to the relationships between different sources of information and how well they agree on the attributes they present. For example, if the physical address recorded for the bank account matches the information given in a form, that attribute can be said to be consistent. By evaluating each attribute on these two characteristics, an overall assessment of the identity presented can be made. A risk model that penalizes shallow and inconsistent identities can then mitigate the risk of accepting a synthetic identity.
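
The depth and consistency idea can be illustrated with a minimal Python sketch. It is not the model described in [MCK2019]: the `AttributeEvidence` fields, the `FULL_DEPTH_YEARS` constant and the way the two scores are combined are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class AttributeEvidence:
    name: str              # e.g. "phone" or "address"
    age_years: float       # how long the attribute has been on record (depth)
    sources_agreeing: int  # sources that report the same value (consistency)
    sources_checked: int   # sources consulted for this attribute


# Hypothetical tuning constant: history length treated as "fully deep".
FULL_DEPTH_YEARS = 5.0


def depth_score(ev: AttributeEvidence) -> float:
    """Score in [0, 1] that grows with the age of the attribute."""
    return min(max(ev.age_years, 0.0) / FULL_DEPTH_YEARS, 1.0)


def consistency_score(ev: AttributeEvidence) -> float:
    """Fraction of consulted sources that agree on the attribute."""
    if ev.sources_checked == 0:
        return 0.0
    return ev.sources_agreeing / ev.sources_checked


def identity_risk(evidence: list[AttributeEvidence]) -> float:
    """Risk in [0, 1]; shallow and inconsistent identities score higher."""
    if not evidence:
        return 1.0  # nothing to support the identity
    trust = sum(depth_score(e) * consistency_score(e) for e in evidence)
    return 1.0 - trust / len(evidence)


established = [
    AttributeEvidence("phone", age_years=8, sources_agreeing=3, sources_checked=3),
    AttributeEvidence("address", age_years=6, sources_agreeing=2, sources_checked=2),
]
recent = [
    AttributeEvidence("phone", age_years=0.5, sources_agreeing=1, sources_checked=3),
    AttributeEvidence("address", age_years=0.2, sources_agreeing=1, sources_checked=2),
]
print(identity_risk(established), identity_risk(recent))
```

Multiplying the two scores means that an attribute must be both long-lived and corroborated to earn trust. A real risk model would calibrate these weights against labelled cases rather than fix them by hand.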

[SOE2014] presents an “identity ecosystem” that makes it possible to evaluate the risk level of the attributes presented, based on the probability that they have been compromised. As mentioned in the previous section, attributes of our identity may unfortunately be available to third parties. If a specific attribute is known to be available to third parties, an entity should consider that attribute alone insufficient to prove a person’s identity. This solution therefore seeks to quantify the level of risk of each attribute based on how the attributes are presented and how they relate to each other. For example, a Colombian citizenship card (cédula) contains a person’s first names, last names and date of birth. If someone’s ID card is lost, a third party who has that person’s names can therefore be expected to have the date of birth as well. Just like the cédula, many other sources of personally identifiable information link the attributes of an identity. This makes it possible to build a model that assigns a risk level to each attribute and indicates to the entity how much confidence it can place in the identities presented.
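
The ID card example can be made concrete with a small Python sketch. This is a simplification for illustration, not the model of [SOE2014]: the `SOURCES` table, its probabilities and the assumption that sources are compromised independently are all hypothetical.

```python
from itertools import product

# Hypothetical sources: the attributes each one carries and an assumed
# probability that the source has been compromised (lost, leaked, stolen).
SOURCES = {
    "id_card": ({"first_names", "last_names", "date_of_birth"}, 0.10),
    "utility_bill": ({"last_names", "address"}, 0.20),
}


def exposure_probability(presented, sources=SOURCES):
    """Probability that a third party holds every attribute in `presented`,
    assuming each source is compromised independently of the others."""
    names = list(sources)
    total = 0.0
    # Enumerate every combination of compromised / intact sources.
    for scenario in product([False, True], repeat=len(names)):
        prob = 1.0
        leaked = set()
        for name, compromised in zip(names, scenario):
            attrs, p = sources[name]
            prob *= p if compromised else 1.0 - p
            if compromised:
                leaked |= attrs
        if set(presented) <= leaked:
            total += prob
    return total


names_only = exposure_probability({"first_names"})
names_and_dob = exposure_probability({"first_names", "date_of_birth"})
names_and_address = exposure_probability({"first_names", "address"})
print(names_only, names_and_dob, names_and_address)
```

Under these assumptions, presenting the date of birth alongside the first names does not lower the exposure probability, because both come from the same card. Adding the address does lower it, because a second source would also have to be compromised. Enumerating every scenario is only practical for a handful of sources.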

Equifax [EQU2019], for its part, proposes an attribute assessment that detects synthetic identities by generating profiles and relating them to attributes provided by different information sources. The objective of this system is to detect inconsistencies between the attributes reported by the different sources, or links between the attributes presented and those of profiles that have been associated with some type of fraud. This last aspect is relevant because fraud based on synthetic identities usually relies on profiles that have a high credit score. Comparing against profiles already linked to fraud therefore makes it easier to identify patterns of abuse.
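
The two checks described above, inconsistencies across sources and overlap with profiles linked to fraud, can be sketched in Python as follows. This is an illustrative sketch only and does not reproduce the method in [EQU2019]; the function names and the record layout (plain dictionaries of attribute name to string value) are assumptions.

```python
def find_inconsistencies(records_by_source):
    """Return the attributes whose values differ between sources.

    `records_by_source` maps a source name to a dict of attribute -> value.
    Values are compared after trimming whitespace and lowercasing.
    """
    values = {}
    for record in records_by_source.values():
        for attr, value in record.items():
            values.setdefault(attr, set()).add(value.strip().lower())
    return {attr: vals for attr, vals in values.items() if len(vals) > 1}


def shared_with_fraud(application, fraud_profiles):
    """Return (attribute, value) pairs that the application shares with
    any profile previously associated with fraud."""
    hits = set()
    for profile in fraud_profiles:
        for attr, value in profile.items():
            if application.get(attr) == value:
                hits.add((attr, value))
    return hits


records = {
    "bank": {"address": "Calle 1 # 2-3", "phone": "555-0100"},
    "form": {"address": "calle 1 # 2-3 ", "phone": "555-0199"},
}
application = {"phone": "555-0199", "email": "a@example.com"}
known_fraud = [{"phone": "555-0199", "email": "x@example.com"}]

print(find_inconsistencies(records))
print(shared_with_fraud(application, known_fraud))
```

Here the address agrees once formatting differences are ignored, the phone number is flagged as inconsistent, and the same phone number also links the application to a profile already associated with fraud. Production systems would need much more careful normalization and fuzzy matching than this exact comparison.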

Conclusions 

Synthetic identities exploit the difficulty of assigning a level of confidence to the attributes presented, since these often cannot be corroborated, as in the case of Social Security numbers in the United States. It is therefore necessary to evaluate the confidence placed in each of the attributes presented. The solutions reviewed use criteria such as consistency and depth to assess the quality of the attributes associated with the identity presented. This is especially important given that many attributes of our identity are outside our control, whether because our personal data has been mishandled or leaked, or simply because we have shared it in public places.

Diego Pacheco-Páramo 

Translated by: Anasol Monguí

Bibliography 

[NCU2020] Synthetic Identities Are One of the Fastest Growing Forms of Identity Theft. National Credit Union Administration Report. February 2020. https://www.ncua.gov/newsroom/ncua-report/2018/synthetic-identities-are-one-fastest-growing-forms-identity-theft

[CAR2019] Tutorial: Why Fraud Matters and What to Do About It. J. Care. Gartner. 2019.

[CON2018] Synthetic Identity Fraud: The Elephant in the Room. J. Conroy. Digital Banking Customer Engagement: Aite Group. 3 May 2018.

[FBI2013] Eighteen People Charged in International $200 Million Credit Card Fraud Scam. FBI. https://archives.fbi.gov/archives/newark/press-releases/2013/eighteen-people-charged-in-international-200-million-credit-card-fraud-scam

[MCK2019] Fighting back against synthetic identity fraud. B. Richardson and D. Waldron. McKinsey & Company. January 2019.

[SOE2014] Trustworthiness of Identity Attributes. B. Soeder and K. Barber. SIN ’14: Proceedings of the 7th International Conference on Security of Information and Networks. September 2014.

[EQU2019] Synthetic Online Entity Detection. Pub. No.: US 2019/0164173 A1. Equifax Inc. May 2019.