Sybil-Resilient Identifier and Proof of Uniqueness


There are many situations where it is valuable to be able to identify a unique individual (elections, token distributions, event tickets), as a Sybil attack would allow a malicious party to gain an unfair advantage. In this write-up, we assess different solutions for Sybil-resistant identifiers.

First, let us talk about the model. We assume there are trusted identity providers (IDPs) that issue identity credentials to users. The users want to access services for which they have to produce an identifier. The services want these identifiers to be Sybil-resistant, i.e., it should be hard for a user to produce two different identifiers for the same service. On the other hand, users would like privacy. This could mean that the identifier does not reveal the user’s identity or that identifiers are not linkable across different services. As we will see below, it is easy to achieve one of those properties, but getting both Sybil resistance and strong privacy is hard.

Reveal Attributes

We start with the simplest solution. When a user wants to sign up with a service, they need to reveal an identity_string, e.g., consisting of[1] first_name, last_name and birthdate, from their identity credential. To prevent a malicious user from cheating, the identity string is accompanied by a zero-knowledge proof showing the correctness of the revealed information. If this is the first time the service sees identity_string, it provides the user with a random identifier uid (signed by the service) and stores the pair (uid, identity_string) in its database.
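The service-side bookkeeping described above could look roughly like the following sketch. The class name, the `|`-separated identity string, and the `proof_ok` flag are assumptions made for illustration; verifying the zero-knowledge proof and signing the uid are stubbed out.

```python
import secrets


class Service:
    """Toy registry for the reveal-attributes flow (hypothetical, not a real API)."""

    def __init__(self):
        self.by_identity = {}  # identity_string -> uid

    def sign_up(self, identity_string, proof_ok):
        # proof_ok stands in for verifying the zero-knowledge proof.
        if not proof_ok:
            raise ValueError("invalid proof")
        if identity_string in self.by_identity:
            return None  # identity string already seen: no second identifier
        uid = secrets.token_hex(16)  # random identifier; signing is omitted
        self.by_identity[identity_string] = uid
        return uid


service = Service()
first = service.sign_up("James|Smith|1980-09-09", proof_ok=True)
second = service.sign_up("James|Smith|1980-09-09", proof_ok=True)
```

Here `first` is a fresh random uid and `second` is `None`, which also shows the collision problem: a different James Smith with the same birth date would be turned away as well.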

Zero-knowledge proof. Technically, the above zero-knowledge proof shows the statement “I know a signature from the identity provider on the revealed attributes”. Using a suitable signature scheme this can be done with simple Sigma protocols.

Sybil resistance. We first note that the user cannot reveal an incorrect identity string. The zero-knowledge proof ensures that it corresponds to the information in the user’s ID credential and the model assumes that this information is correct (trusted IDPs). The Sybil-resistance now depends on the required identity string. If it is too specific, e.g., contains the passport number, a user might be able to sign up twice (e.g., as they have two passports) or not be able to sign up at all (as they might not have a passport). If it is too loose, e.g., it consists of last_name only, some users might not be able to sign up as there will be a collision with an existing user’s identity string.

Privacy. The service learns all information contained in the identity string. The user will have to trust the service provider to process and store the information in a diligent manner.

Identity string. As seen above, selecting a good identity string is crucial for Sybil resistance. Here, we discuss the advantages and disadvantages of different options.

  • First name, last name, birth date: The main advantage of this option is that it does not depend on nationality (a user might be dual-citizen) or the identity document used to register at the identity provider (a user might have a passport and a driver’s license).
    On the flip side, this can increase the likelihood of collisions. For example, with high probability there will be several James Smiths born on September 9th 1980. Transliteration can also cause issues. For example, someone’s name might be spelled Чебышёв or Chebyshev depending on the country issuing an identity document. Even if one insists on using a romanized name this issue persists (e.g. Chebyshev, Tchebychev, Chebyshov, etc.).
    Finally, with the components used in this option one easily identifies the real-world person behind the user.
  • Nationality, ID-document number: The combination of nationality and identity document number should uniquely identify the document the user showed at the identity provider. In that sense, the option provides a good Sybil-resistance.
    On the flip side, a user with different identity documents, potentially even from different countries, can easily bypass the Sybil protection. In reality, this might not be as bad, as most people will not have more than 5 such documents (e.g. two passports, one ID card, a driver’s license, and a residence permit).
    Privacy-wise, it is harder to identify a user given the information in this option (the ID number is normally not easy to find on the web). However, if the service already has more information on the user (e.g. their name), this option now carries the risk that an (inadvertent) leak might connect the ID number with other attributes of the user.

Reveal Hash of Attributes

We can do better. Let H be your favorite hash function (e.g. SHA-256). Instead of revealing the identity string to the service, users reveal h = H(identity string) and prove in zero-knowledge that h has been computed correctly from their identity credential. Depending on the hash function H used, this will require heavier zkSNARK (zero-knowledge succinct non-interactive argument of knowledge) technology. Ideally, H is chosen to be a SNARK-friendly hash (e.g., Pedersen hash). The rest of the system is kept unchanged.

Zero-knowledge proof. Technically, the zero-knowledge proof shows the statement “I know a signature from the identity provider on attributes of identity string and h = H(identity string)”.

Sybil resistance. Exactly the same as in the first solution, i.e., depending on the chosen string, there will either be false positives or false negatives (or both).

Privacy. The service only learns the hash h of the identity string. While this is better than the first solution, it still has multiple vulnerabilities. If the chosen string is short enough, a brute-force attack that tries all combinations can reveal all user data. This applies, for example, to the option from the previous section where the identity string is composed of the nationality and ID document number. If the string is too long for that (e.g., first name, last name, date of birth), knowing partial user data still allows the brute-force attack to recover the rest. The hash also allows tracking users. For example, two collaborating services can check whether the same user signed up with both of them, as the user would reveal the same hash to each. Another issue is the low entropy of the hashed values. If a service knows the identity string attributes of a specific person (which are not too hard to find out), it can check whether that person has signed up with it. The hash h must therefore not be made public. Keeping it private prevents third parties from carrying out these attacks, but still allows the service providers to learn the users’ private data.

We remark that due to the hash revealing so much information about the user data, this hash should never be stored on-chain, e.g., this solution cannot be used in a smart contract.
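To make the partial brute-force attack concrete, here is a small sketch. It assumes SHA-256 as H and a `|`-separated identity string of first name, last name and ISO birth date; both the encoding and the function names are illustrative choices, not part of the proposal.

```python
import hashlib
from datetime import date, timedelta


def h(identity_string):
    # Plain, unkeyed hash of the identity string (the value revealed in this solution).
    return hashlib.sha256(identity_string.encode()).hexdigest()


def recover_birthdate(target, first_name, last_name, start_year=1900, end_year=2024):
    # The attacker knows the name and simply tries every possible birth date.
    day = date(start_year, 1, 1)
    end = date(end_year, 12, 31)
    while day <= end:
        if h(f"{first_name}|{last_name}|{day.isoformat()}") == target:
            return day
        day += timedelta(days=1)
    return None


leaked = h("James|Smith|1980-09-09")
found = recover_birthdate(leaked, "James", "Smith")
```

The search space is only about 45,000 candidate dates, so the birth date is recovered almost instantly. The same loop, run against a list of known people, tests whether a specific person has signed up.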

Context Dependent Identifier

The privacy shortcomings of the above solutions are primarily due to the fact that the revealed identifier is globally unique and can be deterministically computed from the identity attributes of the user. We can do better by using more cryptography.

Pseudo-random function family (PRF). A PRF takes as input a key k and a message m and outputs value r. Roughly, the guarantee is that for any message m the output r appears random if the key k was chosen randomly. In particular, if the random key k is unknown, it is hard (for any message m) to distinguish the output r = PRF(k,m) from a uniform random value. There exist zero-knowledge friendly PRFs that allow for efficient proofs of correctness (of the PRF computation).
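As a minimal illustration of the PRF interface, the sketch below uses HMAC-SHA256 as a stand-in. This is an assumption for readability only: HMAC-SHA256 is a standard PRF candidate, but it is not one of the zero-knowledge friendly PRFs the text refers to.

```python
import hashlib
import hmac
import secrets


def prf(key: bytes, message: bytes) -> bytes:
    # Stand-in PRF: HMAC-SHA256 keyed with `key`, evaluated on `message`.
    return hmac.new(key, message, hashlib.sha256).digest()


k = secrets.token_bytes(32)      # random secret key
r1 = prf(k, b"message")
r2 = prf(k, b"message")          # same key and message: same output
r3 = prf(k, b"other message")    # different message: unrelated-looking output
```

The function is deterministic for a fixed key, yet without k the outputs should be indistinguishable from random 32-byte strings.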

The overall protocol is as follows. The identity provider adds a special uniqueness key to the identity credential. Users can then use this key to generate a context dependent identifier whenever they sign up for a service.

ID credential and uniqueness key. We assume that each identity provider has a PRF key key_IDP. Whenever they issue an identity credential to a user, they also add the uniqueness key key_user = PRF(key_IDP, identity_string) to the credential where the identity_string is the same as in the above. The user needs to keep this key secret.

Context dependent identifier. When a user signs-up to a service, they reveal uid = PRF(key_user, context), where context is a string describing the service context (e.g., “MyFancyAirdrop204”). In addition the user provides a zero-knowledge proof that uid was computed correctly.
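The two derivation steps can be sketched as follows, again with HMAC-SHA256 standing in for the PRF. The all-zero key_IDP, the `|`-separated identity string, and the second context name are placeholders chosen for illustration; the zero-knowledge proof is not modelled.

```python
import hashlib
import hmac


def prf(key: bytes, message: bytes) -> bytes:
    # Stand-in PRF (HMAC-SHA256); a real deployment would use a ZK-friendly PRF.
    return hmac.new(key, message, hashlib.sha256).digest()


# Identity provider side: derive the user's uniqueness key at issuance.
key_idp = bytes(32)  # placeholder only; the real key_IDP is random and secret
identity_string = b"James|Smith|1980-09-09"
key_user = prf(key_idp, identity_string)

# User side: derive one identifier per service context.
uid_airdrop = prf(key_user, b"MyFancyAirdrop204")
uid_other = prf(key_user, b"SomeOtherService")  # hypothetical second context
```

The same user always obtains the same uid within one context, while `uid_airdrop` and `uid_other` look unrelated to anyone who does not know key_user.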

Zero-knowledge proof. Technically, the zero-knowledge proof shows the statement “I know a signature from the identity provider on key_user and uid = PRF(key_user, context)”. Note that both the signature and key_user are part of the witness and thus secret.

For a suitable signature scheme and PRF this can be done using Sigma protocols.

Sybil resistance. The solution has the same limitations regarding the identity_string as the above solutions. In addition, it only provides guarantees for credentials from the same identity provider.

Observe that for a fixed key_user the user’s identifier for a given context is fixed. Furthermore, the zero-knowledge proof ensures that the presented identifier has been computed from the key_user signed by the identity provider. This reduces the Sybil resistance discussion to the uniqueness of key_user.

For a given key_IDP and fixed identity_string the uniqueness key key_user is fixed. Thus, for a given identity provider the Sybil resistance reduces to the uniqueness of the identity_string. So as in the above solutions, depending on the chosen string, there will either be false positives or false negatives (or both). For example, if the identity string contains the passport number, a user with two passports will be able to get two different key_user which allows them to break Sybil resistance. On the other hand, if the identity string is just the first name, then two users called Alice will both get the same uniqueness key.

Finally, for two different identity providers (which have different key_IDP) a user will get (with overwhelming probability) different uniqueness keys even if they sign-up with the same identity string.
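The three cases discussed above (two passports, two users named Alice, and two identity providers) can be checked in a few lines. HMAC-SHA256 is an assumed stand-in for the PRF, and the identity string formats and document numbers are made up for illustration.

```python
import hashlib
import hmac
import secrets


def issue_key_user(key_idp: bytes, identity_string: str) -> bytes:
    # key_user = PRF(key_IDP, identity_string), with HMAC-SHA256 as stand-in PRF.
    return hmac.new(key_idp, identity_string.encode(), hashlib.sha256).digest()


key_idp_a = secrets.token_bytes(32)
key_idp_b = secrets.token_bytes(32)

# One person, two passports: two different uniqueness keys (false negative).
passport_1 = issue_key_user(key_idp_a, "CH|X1234567")
passport_2 = issue_key_user(key_idp_a, "DK|P7654321")

# Two people named Alice, first name only: the same uniqueness key (false positive).
alice_1 = issue_key_user(key_idp_a, "Alice")
alice_2 = issue_key_user(key_idp_a, "Alice")

# Same identity string at two identity providers: different uniqueness keys.
at_idp_a = issue_key_user(key_idp_a, "James|Smith|1980-09-09")
at_idp_b = issue_key_user(key_idp_b, "James|Smith|1980-09-09")
```

In this sketch `passport_1` differs from `passport_2`, `alice_1` equals `alice_2`, and `at_idp_a` differs from `at_idp_b`, matching the discussion.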

Privacy. In a nutshell, the use of a PRF ensures that the revealed identifier uid appears random, preventing user tracking. Given two identifiers from different contexts, it is infeasible to test whether they are from the same user or not. Furthermore, even if one knows the identity_string of a user, one cannot compute uid without knowing key_user. This prevents the brute-force attacks described above.

Note that it is safe to use these identifiers on-chain. For example, a user can publish their uid in a smart contract to participate in an airdrop.

Legacy credentials. This solution requires a new type of attribute in identity credentials. To support the case where a large user base already has “normal” identity credentials, one can instead issue an add-on credential that provides the uniqueness key. If there is a single issuer for this add-on credential, relying on that issuer instead of the individual IDPs also solves the problem of different IDPs providing different uids to the same user.


  1. One needs to take care when crafting the identity string from variable length strings. Simple concatenation can cause issues as e.g. AA||B = A||AB. So instead one could concatenate the hashes of the different parts, i.e., identity_string = H(str_1)||..||H(str_n). ↩︎
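The ambiguity in this footnote is easy to reproduce. The sketch below assumes SHA-256 for H; the helper names are illustrative.

```python
import hashlib


def naive_identity_string(parts):
    # Simple concatenation: part boundaries are lost.
    return "".join(parts).encode()


def hashed_identity_string(parts):
    # Concatenate fixed-length hashes of the parts, H(str_1)||..||H(str_n).
    return b"".join(hashlib.sha256(p.encode()).digest() for p in parts)


collides = naive_identity_string(["AA", "B"]) == naive_identity_string(["A", "AB"])
distinct = hashed_identity_string(["AA", "B"]) != hashed_identity_string(["A", "AB"])
```

Because every SHA-256 digest is exactly 32 bytes, the boundaries between parts are unambiguous in the second encoding.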

Thanks for posting the proposals @tschudid :heart_eyes:

Response to Proposal: Sybil-Resilient Identifier and Proof of Uniqueness

The proposal outlines several approaches for generating Sybil-resistant identifiers, with varying levels of privacy protection. Evaluating these models against AesirX’s mission for privacy, particularly focusing on indirect zero-knowledge proofs (ZKPs), and in light of the latest FTC guidance on hashing, reveals critical privacy issues that need addressing. Below is a breakdown of the concerns and regulatory risks, along with recommendations for improvement.

1. Direct Attribute Revelation (First Model):

Privacy Issues: This approach requires users to reveal identity attributes such as their first name, last name, and birthdate to the service provider. Although zero-knowledge proofs (ZKPs) are used to verify the correctness of the revealed information, the service provider still gains access to these attributes, creating significant privacy risks. Not only does this expose users to profiling and potential data breaches, but it also places trust entirely in the service provider to handle the data responsibly.

Regulatory Risks: Under GDPR, revealing and storing such identifiable information makes the service provider a data processor, triggering obligations related to data security, transparency, and obtaining explicit user consent. This approach would also likely be considered non-compliant with the FTC’s updated stance on privacy, which stresses that even when data is obscured through techniques like hashing or pseudonymization, it is still considered personal data if it can be linked back to an individual.

2. Hashing Identity Attributes (Second Model):

Privacy Issues Under New Guidance: The proposal suggests improving privacy by hashing identity attributes before revealing them to the service provider. While hashing may initially obscure the original data, the FTC guidance emphasizes that hashes do not qualify as anonymization if they create a persistent identifier that can be tracked across contexts. Hashes are susceptible to brute-force attacks or dictionary attacks, especially when common or low-entropy identity attributes are used. Additionally, consistent hashes across different services can be used to track users, thereby compromising privacy.

Regulatory Risks: Even though the hashed identity string might appear anonymous, it is still classified as personal data under GDPR and similar frameworks because it remains linkable to individuals. The FTC has explicitly stated that treating hashes as anonymous is misleading, and companies claiming that hashing de-identifies data could face enforcement actions. For GDPR compliance, treating hashed data as anonymous without the ability to fully de-link it from the original identity could lead to significant fines.

Link to read more on Hashing as Identifiers.

3. Context-Dependent Identifiers with PRF (Third Model):

Privacy Improvements: This approach introduces context-dependent identifiers generated using a pseudo-random function (PRF), which adds some privacy benefits by making identifiers unique to specific services. This can prevent cross-service tracking and linkability if implemented correctly. However, the model’s effectiveness still hinges on the robustness of the PRF and how consistently the identifiers are generated across contexts. According to the latest FTC guidance, even context-dependent identifiers could be treated as personal data if they act as persistent identifiers and facilitate tracking.

Regulatory Risks: Although this model performs better from a privacy perspective, companies must ensure that PRF-derived identifiers do not create patterns that can be linked across services. If any persistent identifier is generated that allows tracking, the organization would be classified as a data processor under GDPR. This means handling these identifiers would require the same level of regulatory compliance, including transparency, obtaining informed consent, and secure data handling practices.

4. General Issues with Centralized Identity Providers (IDPs):

Data Processor Classification: Centralized identity providers (IDPs) that issue credentials and store uniqueness keys are likely to be classified as data controllers under GDPR. Any site or service interacting with these IDPs would be subject to joint processing responsibilities, making them liable for how user data is managed and shared. Furthermore, placing too much reliance on centralized IDPs introduces significant privacy risks, including large-scale data breaches and profiling concerns, which run counter to AesirX’s principles of decentralization and user empowerment.

5. Updated Privacy Considerations:

Avoid Misleading Claims About Anonymization: The latest FTC guidance clearly states that companies should not claim that hashing anonymizes data if it does not fully de-link the data from individuals. In the context of this proposal, it is crucial to avoid misleading users about the privacy benefits of these models. Persistent identifiers, whether hashed or PRF-derived, must be treated as personal data and handled accordingly.

Enhance Transparency and Consent Mechanisms: Even with improved privacy protections through cryptographic techniques, organizations must still be transparent about how identifiers are generated and used. Consent mechanisms should clearly explain the potential for tracking and data sharing, ensuring that users are fully informed about how their data is being processed and safeguarded.

Focus on Indirect Zero-Knowledge Proofs: Aligning with AesirX’s privacy-first approach, solutions should focus more on indirect ZKPs that avoid revealing or storing any sensitive information, even in hashed form. This approach minimizes the risk of tracking and re-identification, better aligning with both the GDPR and the FTC’s stricter guidelines.

Conclusion and Recommendations:

The context-dependent identifier model shows potential, but it must be carefully managed to ensure that identifiers are not linkable across different services. Over-reliance on hashing or centralized identity systems introduces privacy and regulatory risks that are avoidable with better privacy-by-design principles.

For AesirX-aligned implementations, the use of first-party, decentralized solutions that avoid the need for tracking or linking user data remains the most privacy-compliant approach. Implementing indirect ZKPs ensures that users’ identities remain protected while still providing the necessary verifications for Sybil resistance.

Outline of Proposed Model from AesirX
(Focusing only on Uniqueness of ID and Anti-Sybil with optimal compliance of privacy)

1. Mechanism of ID Generation and Validation:

Inputs: The model takes three distinct components from a user’s identity credential:

  • Base ID credentials (e.g., first name, last name, date of birth).
  • Document type (e.g., passport, driver’s license).
  • Document ID (e.g., passport number or driver’s license number).

Hashing Process: These three components are combined and hashed together using a secure cryptographic hash function. The resulting hashed value is a unique identifier (UID) that cannot be easily reverse-engineered due to the complexity of the inputs (similar to the “three-body problem” concept in physics, where the interaction between three elements creates unpredictability).

Compare Function: The hashed UID is never directly revealed. Instead, it is stored in a secure environment where it is only used for comparison purposes. For example, when a new user attempts to register, the system hashes the same three components and checks whether the resulting UID already exists in the system. If it does, it indicates that the ID has been used before, blocking duplicate registrations. This ensures that the same ID credentials cannot be reused to create multiple identities.

Indirect Proof: The process functions as an indirect proof of uniqueness because the hashed UID is only accessible via the comparison function. It is impossible to trace the UID back to the original inputs, protecting user privacy.

2. Data Processing and Collection by the Site Owner:

In the AesirX model, the site owner would collect and store the hashed UID to ensure that a user is not registering multiple wallets or credentials with the same underlying identity. However, the site owner never gains access to the original ID components, only the resulting hash.

Assessment of Privacy and Compliance

1. Privacy Strengths:

Indirect Proof with Secure Hashing: Unlike direct attribute revelation or single-component hashing (as seen in the Concordium proposal), this approach leverages the complexity of combining three components, making it extremely difficult to reverse-engineer the original inputs. The “three-body” principle creates significant unpredictability, enhancing privacy.

Minimized Data Exposure: The original ID components are never exposed, and the UID is only accessible through a comparison function. This setup significantly reduces the risk of tracking, profiling, or cross-service linkability, aligning well with AesirX’s commitment to privacy by design.

No Persistent Identifiers Across Services: Since the UID is only used within the specific context of preventing duplicate registrations, it avoids the pitfalls of cross-service tracking, a key concern in the FTC’s guidance.

2. Regulatory Compliance:

FTC and GDPR Compliance: The FTC guidance emphasizes that hashing alone does not make data anonymous if the hashed value can be used as a persistent identifier. In this model, the UID is only generated for comparison and is never exposed or used beyond this function. This greatly reduces the risk of non-compliance. However, site owners would still need to ensure that the hashing function is robust and that no information is leaked during the comparison process.

Data Processor Classification: In this setup, the site owner still handles the hashed UID, making them a potential data processor under GDPR. However, since they never store or access the original ID components, the compliance burden is significantly reduced. Explicit consent would still be required from users, along with transparency about the purpose and handling of their data.

3. Sybil-Resistance and Security:

Strong Sybil-Resistance: The combined hashing of three inputs ensures that even if a user has multiple ID documents (e.g., two passports), it is much harder for them to generate multiple UIDs that evade detection. This approach provides robust protection against Sybil attacks.

Resilience Against Attacks: The three-component structure increases entropy, making brute-force or dictionary attacks highly impractical. Additionally, the inability to reverse-engineer the UID further enhances security.

Comparison with the 3 Model Proposal

Direct Attribute Revelation (First Model):

Proposal Model 1: Exposes identity attributes directly, relying on the service provider to handle this data securely.

AesirX Proposal: Never exposes the original identity attributes; the UID is only accessible via comparison, significantly reducing the risk of data breaches or misuse.

Hashing Identity Attributes (Second Model):

Proposal Model 2: Uses a simple hash of identity attributes, which can still be vulnerable to brute-force attacks or re-identification, especially given the FTC’s concerns about treating hashes as anonymous.

AesirX Proposal: Enhances security by combining three diverse components, creating a more unpredictable and secure hashed output, aligned with modern privacy guidelines.

Context-Dependent Identifiers with PRF (Third Model):

Proposal Model 3: Introduces context-specific identifiers, which improve privacy but still rely on persistent identifiers across contexts, risking tracking.

AesirX Proposal: Avoids persistent identifiers entirely by limiting the UID’s scope to a single function (detecting duplicate registrations). This minimizes the potential for tracking or misuse.

Conclusion and Recommendations

The AesirX proposal offers a more privacy-preserving solution than the three proposed models by leveraging a multi-component hashing approach that is harder to reverse-engineer and is only used within a narrowly defined context. This aligns closely with both GDPR and FTC guidelines, reducing the regulatory risks associated with persistent identifiers.

The indirect proof model of ensuring uniqueness without exposing the original data strikes a balance between robust Sybil-resistance and user privacy, offering a superior alternative to the approaches discussed in the three model proposal.

By focusing on minimizing data exposure and using indirect proofs, the AesirX approach provides a stronger privacy model that remains compliant with evolving regulations and privacy best practices.

I hope I have helped to clarify why the privacy compliance aspects of designing these solutions and features are essential, and why the model we proposed at the start of this year is still the better solution.

PS: Our proposal (also on this forum) also contains Seamless ID, which prioritizes eIDs as the trusted IdentityProviderType to further strengthen anti-Sybil measures. Government-issued eIDs are the strongest in the market and fully support eIDAS2.0, the direction the EU has taken, and they can serve as a future base for extending to Seamless KYC. In this response, however, I have focused only on the first part: resolving the problem of uniqueness in the most ideal way.

Interesting proposal. I’m no real expert in the area …, but have you considered incorporating the user-specific uniqueness key generation function directly into the Concordium trusted wallet instead of having the IDP generate a unique key_user for each user? Initially, users would still need to be verified by a trusted IDP to gain access to the wallet. After this verification, the trusted wallet could independently generate a context-dependent unique key for each user, preventing them from participating in the same contest or service multiple times. When a user signs up for a service via the wallet, they would reveal a unique identifier generated by the wallet specifically for that service context. This approach could also eliminate the need for users to manage any new secret keys, as the wallet would securely handle all key management. It will likely need to be part of the wallet SDK to support different wallets. The SDK should be responsible for generating keys and providing context-based identifiers.

@NHS The model in which we work is that the user cannot be trusted, but the IDP is trusted. We don’t trust the user because the user has a motivation for being dishonest, e.g., signing up twice to an airdrop. If the user were honest, then we wouldn’t need a proof of uniqueness at all. But we can’t do all we would want in a model where all parties are distrusted, so we add an honest IDP to get better privacy. If we don’t trust the IDP, it is essentially equivalent to having no IDP involved.

As Ronni also explained, a hash of sensitive data is still sensitive data, and revealing it can result in revealing personal information. The solution to that is to add a secret key to the data being hashed, so use H(key,identity_string) instead of H(identity_string). (In the context-dependent identifier solution above, we used a PRF instead of a hash function. But for this explanation, the difference is not relevant.) Now if the user chooses key, then they can access the same service multiple times using different keys and pretending to be different people—remember, they are dishonest. That is why the best we can do without a trusted IDP is the Solution 2 above (Reveal Hash of Attributes). But if a trusted IDP generates the key, then they will always use the same key.

In your question you ask about the wallet generating the key. If people always used our wallet, then yes, it would always generate the same key. But the wallet is software run on the user’s computer, and since we don’t trust the user, they could change it to use different keys to access the same service multiple times pretending to be different people. That is why we need the key to come from a trusted source, namely the IDP.
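The difference between a user-chosen key and an IDP-chosen key can be seen in a toy registration check. As a simplification, HMAC-SHA256 plays the role of H(key, identity_string), and a single IDP-held key stands in for the key the IDP would issue; the names are illustrative only.

```python
import hashlib
import hmac
import secrets


def keyed_hash(key: bytes, identity_string: str) -> str:
    # H(key, identity_string), modelled here with HMAC-SHA256.
    return hmac.new(key, identity_string.encode(), hashlib.sha256).hexdigest()


def register(registry: set, identifier: str) -> bool:
    # The service accepts an identifier only if it has not seen it before.
    if identifier in registry:
        return False
    registry.add(identifier)
    return True


identity = "James|Smith|1980-09-09"

# Dishonest user picks a fresh key for every sign-up attempt.
registry_user_keys = set()
user_attempts = [
    register(registry_user_keys, keyed_hash(secrets.token_bytes(32), identity))
    for _ in range(2)
]

# Trusted IDP always uses the same key for the same identity string.
idp_key = secrets.token_bytes(32)
registry_idp_key = set()
idp_attempts = [
    register(registry_idp_key, keyed_hash(idp_key, identity))
    for _ in range(2)
]
```

With user-chosen keys both attempts succeed, so Sybil resistance is lost. With the IDP-chosen key the second attempt is rejected.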

@VikingTechGuy, could you please help me understand how and where UIDs are stored in your solution? Let me give you a very simple example. Suppose that two users Alice and Bob, want to access a service offered by Charlie (e.g., an airdrop), and Charlie wants to be sure that these are two different people, not the same person registering twice. Alice and Bob each compute UID_A = H(Alice personal data) and UID_B = H(Bob personal data). But what do they then do with UID_A and UID_B? If they send this to Charlie, then this is essentially our Solution 2 (Reveal Hash of Attributes). But if they don’t send those hashes to Charlie, then Charlie cannot compare them to know if these are two different users or the same user.

Rereading this thread, we realized that we should have added a context string to Solution 2 as well, i.e., h = H(identity_string, context). This prevents tracking between services, but it is still vulnerable to the two other privacy attacks:

  1. Brute force reversal, or partial brute force reversal if we already know part of the identity_string.
  2. Identifying that a user uses a service if we know their identity_string.
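A short sketch of the context-bound hash shows both effects. SHA-256 and the encoding of the pair (hashing each field first to keep the boundary unambiguous) are assumptions; the service names are made up.

```python
import hashlib


def h(identity_string: str, context: str) -> str:
    # H(identity_string, context); each field is hashed first so the pair
    # has an unambiguous fixed-length encoding.
    data = (
        hashlib.sha256(identity_string.encode()).digest()
        + hashlib.sha256(context.encode()).digest()
    )
    return hashlib.sha256(data).hexdigest()


identity = "James|Smith|1980-09-09"
h_service_a = h(identity, "ServiceA")
h_service_b = h(identity, "ServiceB")

# Anyone who knows the identity string and the public context can recompute
# the hash and test whether this person signed up with ServiceA.
guess = h("James|Smith|1980-09-09", "ServiceA")
```

The two services see different values, so comparing their databases no longer links the user. The recomputed `guess`, however, equals `h_service_a`, which is exactly attack 2 above, and the same recomputation drives the brute-force attack in 1.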

Our goal with this post is to compare Solutions 2 and 3, and to decide which we should implement. You seem to favor Solution 3 due to the extra privacy, but it also has disadvantages. In particular, it involves a third party (the IDP), and the personal data that is hashed is fixed: the service provider cannot choose it.

EDIT: maybe not allowing the service provider to choose the data to be hashed is a good thing, because they could abuse the system by requesting a hash of small amounts of data which they could brute force. When I wrote the paragraph above, I was just thinking that there is more flexibility if the value being hashed is not fixed. But maybe hashing as much data as possible as Ronni suggested is the best thing, and there shouldn’t be another option available.

@chportma Thank you for the further clarification. My intention was to integrate this into the Concordium Wallet SDK, which would then be mandatory for anyone signing up for context-based services. I also assume that the wallet would still be used in cases where the Identity Provider (IDP) generates a new key. However, if wallets really can’t be trusted to handle key generation, then your approach seems to be the most viable option.

@NHS In the case with no key_IDP, all the user data needed to create the hash is in the wallet, so the wallet could generate this hash along with a ZK proof that it is correct, and send both the hash and the ZK proof to the service provider. In the case of using a trusted IDP to get better privacy, the user would have to request key_user = PRF(key_IDP, identity_string) from the IDP, which is then stored in the wallet. That step has to be done only once, after which the wallet can generate the UID = PRF(key_user, context) itself along with the ZK proof that it is correct, and send these to the service providers. All this would be integrated in the wallet SDK. Did that answer your question?


Yes makes sense. Looking forward to being able to use this at some point.

@chportma Thank you for the follow-up and your insightful example regarding Alice, Bob, and Charlie. It has been a busy week and an extended holiday weekend here, but I have now found the time.

In AesirX’s model, the critical point is how and where UIDs are stored and used without exposing them to unnecessary risks or undermining privacy.

UID Storage and Comparison:

  1. UID Storage Location: The UIDs (e.g., UID_A for Alice and UID_B for Bob) are not stored by the service provider (Charlie) directly, but instead, a context-dependent identifier is generated using a pseudo-random function (PRF). These UIDs are specific to the service context (for example, “Airdrop_Event_2024”) and derived from a secret uniqueness key held by the user. This ensures that even if Alice and Bob use the same service in another context, their identifiers are different and cannot be linked across contexts.
  2. Storage on User’s Side: Users (Alice and Bob) store the secret key that they receive from the Identity Provider (IDP) when their identity is verified. This uniqueness key is used to generate the UIDs for different contexts. Importantly, the actual personal data is not shared directly with the service provider (Charlie). Instead, the service only receives the UID that is specific to the service context.
  3. Service Provider’s Role: Charlie, in this case, only stores and compares the UIDs (UID_A and UID_B) without needing access to the underlying identity attributes. The UIDs are used solely for the purpose of preventing Sybil attacks within that specific context (e.g., the airdrop). Since the UID is context-dependent, it does not allow tracking across multiple services, mitigating the privacy risks associated with cross-service linkability.
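The service provider's side of this can be sketched as follows. This is only an illustration under assumptions: the class name UidRegistry and its method are hypothetical, storage is an in-memory dictionary, and verification of the accompanying proof is assumed to have happened before register is called.

```python
class UidRegistry:
    """Tracks which context-specific UIDs a service has already accepted."""

    def __init__(self):
        # Maps a context string to the set of UIDs already seen in that context.
        self._seen = {}

    def register(self, context: str, uid: str) -> bool:
        """Return True if the UID is new for this context, False if it is a repeat."""
        seen = self._seen.setdefault(context, set())
        if uid in seen:
            return False  # same user trying to sign up twice in this context
        seen.add(uid)
        return True
```

Charlie only ever holds opaque UID strings grouped by context; a second sign-up attempt with the same UID in the same context is rejected, while the same string in a different context is treated as unrelated.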

Addressing Your Concerns on Solution 2 and 3:

  • Brute Force and Privacy Attacks: By leveraging a PRF with a context string and ensuring that the UID is context-specific, the risks of brute force or partial brute force attacks are minimized because the context changes with every service. The additional layer of the PRF ensures that even if a portion of the identity string is known, the UID cannot be easily reverse-engineered without the unique secret key.
  • Fixed Personal Data: While the personal data used to generate the uniqueness key is fixed, this is not shared with the service provider. Only the derived UID is revealed for a specific service context. This preserves user privacy while allowing the service provider to verify uniqueness without accessing sensitive data.

To sum up, in AesirX’s implementation:

  • UIDs are stored on the user’s side, tied to their uniqueness key, and revealed only in a context-specific way to service providers.
  • The service provider (Charlie) only compares context-specific UIDs, ensuring privacy and preventing cross-service tracking.
  • The model is designed to mitigate brute force attacks and safeguard privacy by avoiding direct exposure of personal data.

This approach optimizes both Sybil resistance and privacy compliance by leveraging decentralized solutions and context-based identifiers.

I’d like to explain how we will implement AesirX’s own proposal for UID storage and Sybil resistance, specifically in response to the example of Alice, Bob, and Charlie. I’ll also highlight how AesirX Single Sign-On (SSO) plays a vital role in the system, along with upcoming enhancements like the Company ID integration via Concordium and FinReg.

1. How AesirX Will Implement Its Own Proposal

In AesirX’s proposal, we focus on delivering a highly privacy-preserving, secure, and context-specific UID solution. This builds on the principles of UID storage and Sybil resistance while incorporating key features into AesirX SSO to streamline user identity management across multiple services.

a. UID Storage and Comparison: How We Will Do It

Our approach revolves around the use of context-specific UIDs, generated through a pseudo-random function (PRF) tied to a secret uniqueness key. This key is issued by a trusted Identity Provider (IDP) and stored securely on the user’s device.

  • Context-Specific UIDs: For every service interaction (e.g., “Airdrop_Event_2024”), a unique UID is generated that is specific to that service context. This ensures that even if Alice and Bob interact with multiple services, their UIDs will not be reused or linked across contexts.
  • Single Sign-On (SSO) Integration: With AesirX SSO, users like Alice can securely sign into multiple services using the same verified identity, without exposing personal data repeatedly. The SSO ensures that only context-specific UIDs are shared with service providers like Charlie, while Alice’s personal data remains securely stored and managed under the AesirX Shield of Privacy.
  • Privacy by Design: The AesirX Shield of Privacy ensures that the personal data behind the UID remains with the user, and service providers like Charlie only receive the UID needed for verification. Zero-Knowledge Proofs (ZKP) ensure that the service provider can validate the UID without accessing sensitive personal information, keeping user data safe.

b. User Side Control and Secure Storage

The secret uniqueness key remains under the control of the user, securely stored on their device. UIDs are generated dynamically for each service context, ensuring strong Sybil resistance.

  • First-Party Data Sovereignty: To ensure data sovereignty, all UIDs and consent data are securely stored on the AesirX First-Party Server. This guarantees that data stays under the control of the user or organization, and third-party services never gain access to sensitive information.
  • Seamless SSO Management: With AesirX SSO, Alice can use her verified identity to sign in to different services, each time generating a new context-specific UID for the interaction. This simplifies access while maintaining strong privacy and security.

c. Sybil Resistance and Security

Our proposal ensures Sybil resistance by combining PRFs and ZKP with SSO functionality to manage identity verification across multiple services without sacrificing privacy.

  • SSO and Brute Force Protection: Using AesirX SSO enhances protection against brute-force attacks by allowing secure identity verification across services without exposing personal data multiple times. The use of ZKP ensures that even if part of Alice’s identity is known, the context-specific UID cannot be reverse-engineered.

2. How We Do This Today

We have already implemented several key features of this proposal within AesirX, including SSO functionality, context-specific UID generation, and privacy-preserving data storage.

  • UID Storage Location: Today, UIDs are generated on the user’s side, tied to a secret uniqueness key. UIDs are specific to the service context, preventing cross-service tracking. This is handled securely under the AesirX Shield of Privacy.
  • Single Sign-On (SSO): AesirX SSO already provides users with seamless access across multiple services. It ensures that users like Alice can sign into various services using the same verified identity, while the system generates context-specific UIDs to ensure privacy is maintained across each interaction.
  • First-Party Data Control: AesirX First-Party Server securely stores UIDs and consent data. Data remains under the control of the user or organization, ensuring compliance with privacy laws such as GDPR.

3. Coming Soon: Company ID Integration via Concordium and FinReg

While AesirX’s core proposal already delivers privacy, Sybil resistance, and SSO functionality, we’re enhancing this with the introduction of Company ID integration.

a. What the Company ID Will Add:

  • The new Company ID (via Concordium and FinReg) will enable organizations to have a secure, verifiable identity system for managing data and assets across services. This will help companies manage their role-based access more effectively, ensuring that organizational data is handled securely.
  • Company SoP ID will add a new layer of security and governance, especially for companies operating in highly regulated sectors. Alice’s interactions as part of her company will be handled using the Organisation ID, separate from her personal User ID, ensuring role-based control over data access.

b. Role-Based Consent Management:

  • The Company ID enhances role-based consent management, ensuring that organizations can define roles and control access based on employee responsibilities. This allows companies to manage sensitive data more securely and ensures regulatory compliance.

c. First-Party Data Sovereignty with Company ID:

  • The AesirX First-Party Server will continue to securely store all UIDs and consent data, but the integration of the Company ID will allow organizations to manage their identities and data assets more efficiently. This ensures that both personal and organizational data remains under first-party control, with full transparency and privacy protection.

In AesirX’s implementation, we will:

  1. Generate context-specific UIDs using SSO functionality, ensuring seamless access across multiple services while maintaining privacy.
  2. Maintain device-aware, role-based consent management, ensuring that personal and organizational interactions are kept distinct.
  3. Enhance the system with Company ID integration via Concordium and FinReg, providing organizations with stronger tools for managing roles and access control in a secure and compliant way.

Our proposal, combined with SSO and Company ID support, ensures privacy, Sybil resistance, and data sovereignty for both individuals and organizations.

Bonus: Seamless Identity with AesirX Shield of Privacy (SoP)

An additional benefit of the AesirX Shield of Privacy is that it offers Seamless ID capabilities. With SoP, any service provider can easily whitelist users based on their SoP ID, allowing for one-click login and access to specific features, services, or assets. Here’s how this works:

  1. Whitelisting Users with SoP ID:
  • Service providers can use the SoP ID as a trusted identifier for their users. By adding users to a whitelist based on their verified SoP ID, providers can offer frictionless access to services, features, or assets, ensuring that trusted users can log in with a single click.
  • This system is particularly beneficial for services that require regular, secure access, such as subscription platforms, financial services, or asset management. By whitelisting users, these platforms can skip repetitive verification steps, providing a smoother and faster user experience without compromising on security.
  2. One-Click Login and Access to Services:
  • For end-users like Alice, this means they can access various services with one-click login without needing to repeatedly authenticate or share personal data. Once whitelisted, Alice’s SoP ID is recognized by the system, and she can easily access specific services or features that have been pre-approved by the service provider.
  • This ensures that Alice’s personal data remains securely protected while still enjoying the convenience of seamless access.
  3. Streamlined Access Across Multiple Contexts:
  • The context-specific UIDs generated by AesirX ensure that Alice’s identity remains contextually unique for each service. However, by using the SoP ID as a trusted identifier, Alice can access multiple services or features under the same system without needing to re-verify her identity at every turn.
  • Whether she’s accessing a premium feature, logging into a new device, or unlocking specific digital assets, AesirX SoP streamlines the process while maintaining strong privacy and security protections.

How It Works:

  • Service Provider Integration: Service providers can integrate the SoP ID system into their platform, enabling one-click access for verified users.
  • User Experience: Once Alice’s SoP ID is recognized by the service, she can log in or access features without the need for additional verification steps, improving her overall experience while protecting her data.

Enhanced Convenience with Privacy

The Seamless ID functionality offered by AesirX Shield of Privacy (SoP) provides a bonus for both users and service providers. With whitelisted access and one-click login, users enjoy faster, more convenient access to services while service providers benefit from trustworthy identity verification without compromising privacy.

It seems to me that the different proposals are converging, with a focus on computing the UID using a PRF. The description in

seems to require a change to the IDP in order to get the secret key mentioned. It was my understanding that you were going for a solution not requiring changes to the IdP.

I believe that is achieved in the section

A few remarks on this section.

AesirX SSO will be a trusted service for the context-specific identifiers and will know about the user’s use of context-specific identifiers. Given the trust and requirements that we already place on the SSO service, I suppose this is acceptable for your solution.

My main comment on this solution is that it can probably be simplified a little, as a ZKP will not be needed to prove correctness of the context-specific identifier. E.g., the SSO could sign the context-specific identifier and then the service would know that it is valid, but this validation can also be done by other means/protocols that are all more efficient than using a ZKP.
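As one example of such a cheaper alternative, here is a rough sketch that uses a MAC rather than a public-key signature. It assumes the SSO and the service share a secret key, which is an assumption of this sketch and not something stated in the thread; the function names are hypothetical. A real deployment would more likely use an asymmetric signature so that the service cannot forge attestations itself.

```python
import hashlib
import hmac
import json


def attest_uid(shared_key: bytes, context: str, uid: str) -> str:
    """SSO side: compute a tag over (context, uid) with the shared key."""
    # JSON-encode the pair so that the field boundary is unambiguous.
    message = json.dumps([context, uid]).encode("utf-8")
    return hmac.new(shared_key, message, hashlib.sha256).hexdigest()


def verify_uid(shared_key: bytes, context: str, uid: str, tag: str) -> bool:
    """Service side: recompute the tag and compare in constant time."""
    expected = attest_uid(shared_key, context, uid)
    return hmac.compare_digest(expected, tag)
```

The service accepts a UID only if the tag verifies for the context it expects, which is a single hash computation instead of a proof verification.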

Thank you for your thoughtful feedback, @tpryds. I’d like to clarify some points regarding our approach to UID storage and Sybil resistance based on the Concordium ID credentials and the necessity of ZKP for ensuring seamless compliance.

1. IDP Proposal for Uniqueness and the Role of AesirX SSO

You’re correct that we don’t want AesirX SSO to be responsible for generating the uniqueness of users. Our proposal centers on the idea that uniqueness should be derived from the Concordium ID credentials (or a new eID-based IDP), ensuring that this function is handled by the Identity Provider (IDP) itself rather than the AesirX SSO.

  • Why the IDP Needs to Handle Uniqueness: In our vision, Concordium’s ID layer or the eID-based IDP will issue a unique secret key that ensures every user has a single identity across services, without the need for changes to existing IDPs. This would ideally become a core function of Concordium’s Layer 1 or an extension to the IDP standard.
  • No Data Controller/Processor Role for AesirX or Service Providers: Neither AesirX nor the service providers (e.g., website or e-commerce platforms) want to be in the position of acting as data controllers or processors. This is why we are not proposing that AesirX SSO takes on this role. Instead, we leverage Concordium’s ID credentials to handle uniqueness, with AesirX SSO remaining focused on providing the infrastructure for secure login and access control without holding any sensitive data.

In this approach, the IDP’s uniqueness key guarantees that every user has a verifiable and privacy-preserving identity, while AesirX SSO ensures secure interactions without becoming a data controller.

2. Necessity of Indirect ZKP for Cross-Border Compliance

Regarding Zero-Knowledge Proofs (ZKP), I understand your suggestion to simplify the process by having AesirX SSO sign the context-specific UID for verification. However, we believe that ZKP remains essential in our implementation, particularly for ensuring cross-border compliance and supporting more complex verification scenarios.

Here’s why ZKP is necessary for our solution:

  • Cross-Border Compliance: For services operating across borders and jurisdictions, relying solely on the SSO to sign the UID might not meet the regulatory requirements. Indirect ZKP allows us to validate sensitive attributes—such as age or country—without exposing any personal data. This is critical for complying with data protection laws like GDPR and ensuring privacy-preserving operations across regions.
  • Handling Complex Verification Scenarios: We’ve already implemented indirect ZKP in AesirX SSO for verifying Age and Country using Concordium ID credentials. In our system, the SSO can validate that the user meets certain criteria (e.g., is above a certain age) without revealing the exact age or other personal details. We’ve also introduced AND & OR logic in AesirX SSO to handle multiple proofs or fallback levels for more complex scenarios. For example:
    • If a service requires both proof of age and nationality, AesirX SSO can handle the logic and present the required proofs through ZKP, ensuring compliance without compromising privacy.
    • In cases where a user’s initial verification method isn’t available, the fallback mechanisms ensure that verification can still occur through alternative proofs.
    These capabilities are critical, especially when the browser wallet doesn’t currently support more complex scenarios. ZKP allows us to offer these features without relying on the service provider to store or process sensitive user data.
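The AND/OR combination of proofs described above can be illustrated with a small requirement tree. This is a sketch only and not the AesirX SSO API: the class names, the proof labels, and the idea of passing in a set of already-verified proof names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Set, Union


@dataclass(frozen=True)
class Proof:
    name: str  # label of a single verified statement, e.g. "age_over_18"


@dataclass(frozen=True)
class AllOf:
    parts: List["Requirement"]  # AND: every part must hold


@dataclass(frozen=True)
class AnyOf:
    parts: List["Requirement"]  # OR: at least one part must hold (fallbacks)


Requirement = Union[Proof, AllOf, AnyOf]


def is_satisfied(req: Requirement, verified: Set[str]) -> bool:
    """Evaluate a requirement tree against the set of proofs that verified."""
    if isinstance(req, Proof):
        return req.name in verified
    if isinstance(req, AllOf):
        return all(is_satisfied(part, verified) for part in req.parts)
    if isinstance(req, AnyOf):
        return any(is_satisfied(part, verified) for part in req.parts)
    raise TypeError(f"unknown requirement type: {type(req).__name__}")


# Age is mandatory; nationality may fall back to country of residence.
policy = AllOf([
    Proof("age_over_18"),
    AnyOf([Proof("nationality"), Proof("country_of_residence")]),
])
```

With this shape, the first example in the list is an AllOf over two proofs, and a fallback level is an AnyOf whose later entries are the alternative proofs.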

3. Why ZKP is the Best Fit for This Solution

Although simplifying validation by using SSO signatures is an attractive idea, ZKP provides a higher level of assurance in terms of both privacy and compliance. Here’s why we’re sticking with ZKP:

  • Compliance Across Multiple Jurisdictions: With the variety of privacy laws (e.g., GDPR, ePrivacy, California CCPA), ZKP helps us ensure that sensitive data isn’t exposed during verification. It provides a mathematically sound guarantee that personal data remains protected, which is critical for regulatory compliance across borders.
  • Privacy-Preserving Proofs: Unlike an SSO signature, ZKP proves the validity of the identity or attribute (e.g., age, country) without revealing the underlying data. This aligns with our privacy-first philosophy, where service providers only get the information they need—nothing more.

To summarize:

  1. IDP-based Uniqueness: We are advocating for Concordium ID credentials (or an eID-based IDP) to handle uniqueness via a secret key. This avoids making AesirX SSO or service providers responsible for storing sensitive user data and eliminates the need for them to become data controllers or processors.
  2. ZKP is Essential: Zero-Knowledge Proofs remain necessary for cross-border compliance and complex verification scenarios. While signing context-specific UIDs might simplify some aspects, ZKP provides the best guarantee of privacy and regulatory compliance.

Thank you again for your feedback. I look forward to your thoughts and any further discussion on how we can refine this approach.

1. Clarifying the Need for the Uniqueness Key

Why is a Uniqueness Key Needed?

The uniqueness key field is critical for ensuring that each user has a single, verifiable identity that can’t be duplicated or misused across different services (i.e., preventing Sybil attacks). Without it, there would be no reliable way to prevent users from creating multiple accounts under the same service using different credentials.

  • Privacy and Compliance: The uniqueness key allows for context-specific UIDs without exposing sensitive personal data, ensuring GDPR, ePrivacy, and other cross-border regulatory compliance.
  • Data Sovereignty: Users and service providers (like e-commerce platforms) avoid the risks of becoming data controllers or processors by ensuring that the IDP handles uniqueness. This keeps all personal data management at the IDP level, where it belongs.

2. Seamless Integration into Existing IDP Systems

How It Works with Current Systems:

The addition of the uniqueness key field would be a minor extension to the existing IDP infrastructure, not a complete overhaul. Here’s how it can be done with minimal disruption:

  • Layer 1 Addition: For Concordium’s architecture, the uniqueness key can be introduced as a core function at the Layer 1 level. This means the key is generated and stored by the IDP upon the user’s identity verification, ensuring that the process of generating context-specific UIDs happens securely and automatically. This avoids any modification to the existing browser wallet or frontend applications.
  • Existing IDP Extension: For existing IDPs, adding a uniqueness key field can be done via a backward-compatible extension to their current identity issuance process. The uniqueness key would not change how they issue credentials; it would simply add a new cryptographic field to encode uniqueness without altering other data fields like name, address, or birthdate.
  • No Impact on Existing User Data: Importantly, this field would not require changes to existing user credentials, only affecting new identities issued by the IDP. For users who have already been verified, the uniqueness key could be added on demand as part of the user’s next interaction with the IDP, minimizing disruption to their identity credentials.

3. How It Fits into Concordium’s Architecture

Leveraging Concordium’s Built-In Privacy Layers:

Concordium’s Layer 1 already provides a strong foundation for privacy and security, making it an ideal candidate for handling uniqueness at the IDP level. Here’s how:

  • Unique Cryptographic Keys: Concordium’s Layer 1 uses cryptographic primitives that can be extended to include the uniqueness key. By building this into the Layer 1 protocol, each user’s uniqueness would be derived cryptographically, ensuring consistency across services and preventing duplications or Sybil attacks.
  • Compatibility with Existing ZKP Framework: Concordium already supports Zero-Knowledge Proofs (ZKP), and the uniqueness key would fit naturally into this framework. The ZKP can prove that the uniqueness key is valid without revealing the underlying identity data. This is important because the key will serve as the foundation for context-specific UIDs while ensuring privacy.
  • Minimal Impact on Performance: The uniqueness key can be generated during the initial identity verification process (or upon first interaction with the IDP). Once created, it becomes a persistent field that is used to derive context-specific UIDs across services. Because this process happens at the protocol level, it has minimal impact on the end-user experience or system performance.

4. Advantages of Adding the Uniqueness Key to the IDP Standard

Future-Proofing for Scalability and Compliance:

By incorporating the uniqueness key at the IDP level, the solution becomes scalable and future-proof. Here’s why this is a long-term benefit:

  • Sybil Resistance as a Standard: By embedding the uniqueness key directly into Concordium’s Layer 1 or an eID-based IDP, we ensure that all users have one unique identity, preventing Sybil attacks across multiple services. This makes the system inherently more secure, without requiring additional checks from individual service providers or relying on AesirX SSO to generate uniqueness.
  • Cross-Border Compliance: Having uniqueness handled at the IDP level ensures compliance with GDPR and other privacy regulations across borders. The uniqueness key allows the IDP to prove the user’s identity while ensuring that no personal data is leaked in the process, which is crucial for cross-border service interactions.
  • Standardization for Other IDPs: Once the uniqueness key field is added to Concordium or another IDP’s Layer 1 protocol, it sets a standard that can be adopted by other IDPs. This ensures consistency across systems and services and allows service providers to adopt the solution easily without worrying about whether a particular IDP supports the necessary fields for UID generation.

5. Addressing Potential Concerns

Why This Change is Minimal and Beneficial:

For those concerned that requiring IDP changes is a blocker, here are the key points to reassure them:

  • Backward-Compatible Extension: This proposal doesn’t overhaul the existing IDP infrastructure; it merely adds a cryptographic field that enhances uniqueness. Existing identity credentials remain valid, and users can obtain their uniqueness key through an easy extension process when they next interact with the IDP.
  • No Impact on Day-to-Day Functionality: The uniqueness key is generated automatically when a user is verified, requiring no extra steps from the user or service provider. For the end-user, the experience remains the same—they log in, verify their identity, and access services without additional complexity.
  • Critical for Sybil Resistance: Without this key, service providers (including e-commerce platforms) risk Sybil attacks and could inadvertently allow users to create multiple identities. The uniqueness key ensures that the IDP maintains control of identity verification, providing peace of mind to both users and service providers.

Summary:

By adding the uniqueness key to the IDP standard (whether at Concordium’s Layer 1 or through an eID-based IDP), we ensure strong, privacy-preserving identity verification that protects users and service providers. This extension works with existing systems, avoids requiring AesirX or service providers to become data controllers, and provides a future-proof, scalable solution that enhances Sybil resistance and cross-border compliance.

@tpryds @chportma – With the last clarification and summary, I believe we now have everything on the table for a viable model. While there are still some low-level technical details to finalize, we’ve addressed the core logic behind the proposal, particularly the focus on privacy, cross-border compliance, and the critical importance of avoiding data controller or processor risks for businesses using Concordium Layer 1 to manage their operations.

The reasoning for using Concordium’s Layer 1 to handle uniqueness ensures we can offer a scalable, compliant solution without compromising on data sovereignty. I look forward to collaborating further on this as we finalize the technical aspects.