A Way to Enhance Data Quality with “Staking-as-Confidence” and Further Integration with EigenLayer

Codatta
10 min read · Sep 29, 2024


Data quality is key to Codatta

Codatta is a universal annotation and labeling platform that turns your intelligence into AI. Our mission is to lower the barrier for AI development teams by providing inclusive access to quality data, facilitating AI advancement, and empowering individuals to contribute to AI development and enjoy long-lasting rewards for their critical contributions. We tackle challenges across various verticals, including crypto (account and user annotation), healthcare, and robotics. Our user-contributed data is on the right track to commercialization in areas like web3 ads, AML, and healthcare.

To truly grasp how user-contributed data is utilized, it’s essential to explore Data Vault, User Annotation, and Web3 ads and their interconnections in detail. The Data Vault serves as the secure, user-controlled foundation where all personal data — annotated through our User Annotation tools — is stored and managed. This vault is the backbone that enables users to not only protect their data but also monetize it through strategic applications like Web3 ads and dApp personalization. By empowering individuals to contribute their data in a structured and privacy-preserving way, Codatta seamlessly bridges the gap between data ownership and commercial utility, creating a dynamic ecosystem where users are rewarded for their contributions while maintaining full control over their personal information.

Telegram Experiment on Staking as Confidence

Recently, we ran an experiment on Data Vault in Telegram (TG). The experiment aimed to verify that users are more likely to provide authentic data when they stake assets. To do so, we introduced a staking mechanism, “staking-as-confidence,” which collects user demographic information without requiring most users to provide documents or other proof to ensure data quality. We wanted to test whether this method is effective for the Web3 user base.

Setup

We designed an experiment on Telegram to explore the use of personal data staking for targeted advertising. The user information collected (with complete privacy ensured) includes age, gender, and region. By analyzing the combinations of age, gender, and region, we aim to identify the preferences of different demographic groups for more effective targeted content delivery. A control group is also included, in which participants are shown random advertisements. The product implements a workflow in which users:

  1. Self-report their gender, birth year, and country of residence (voluntarily)
  2. Stake assets to join intersubjective staking
  3. See ads curated by our team at different frequencies over a significant period
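
The workflow above can be sketched in a few lines of Python. This is a minimal illustration, not Codatta's implementation: the `Profile` fields mirror the self-reported attributes, while the age buckets, the `catalog` structure, the ad names, and the function names are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Self-reported attributes from step 1."""
    gender: str
    birth_year: int
    country: str


def segment_of(profile: Profile, current_year: int = 2024) -> tuple:
    """Map a profile to a coarse (gender, age bucket, country) segment.

    The bucket boundaries are hypothetical.
    """
    age = current_year - profile.birth_year
    if age < 25:
        bucket = "under-25"
    elif age < 40:
        bucket = "25-39"
    else:
        bucket = "40-plus"
    return (profile.gender, bucket, profile.country)


def pick_ad(profile: Profile, strategy: str, catalog: dict, rng: random.Random) -> str:
    """Choose an ad under the 'targeting' or 'random' exposure strategy."""
    all_ads = sorted({ad for ads in catalog.values() for ad in ads})
    if strategy == "random":
        # Control group: ignore the profile entirely.
        return rng.choice(all_ads)
    if strategy == "targeting":
        # Fall back to the full pool when no ad is curated for the segment.
        return rng.choice(catalog.get(segment_of(profile)) or all_ads)
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical catalog: segment -> ads curated for that segment.
catalog = {
    ("female", "25-39", "SG"): ["defi-wallet-ad"],
    ("male", "under-25", "VN"): ["gamefi-ad"],
}
user = Profile(gender="female", birth_year=1992, country="SG")
print(pick_ad(user, "targeting", catalog, random.Random(0)))  # defi-wallet-ad
```

Keeping the random strategy independent of the profile is what lets the control group serve as a baseline for the targeting group.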

Objectives

The experiment is expected to yield two key insights. First, it will examine the effectiveness of targeted content, with the hypothesis that the preference ratio for targeted content will be higher than that for randomly delivered content. Second, under the same settings, the preference ratio for targeted content is expected to be higher among users who participate in staking than among those who do not.

  1. The first conclusion will demonstrate the importance of user data in enhancing the relevance of advertising in a Web3 environment.
  2. The second will highlight that staking incentivizes users to provide accurate and valuable data.

Result

* Note: In the table above, the columns are defined as follows:

  • stakin_assets: the type of staked asset.
  • exposure_strategy: the deployment strategy. The targeting group is shown content related to users’ demographic information; the random group is shown content chosen at random.
  • exposure_pv: the total number of picture views.
  • like_pv: the number of views in which the user saw the picture and pressed the like button.
  • like_rate: calculated as like_pv/exposure_pv.
  • targeting_uplift: the effectiveness improvement brought by targeted deployment, which can be interpreted as the traffic conversion gain from using user annotation in advertising deployment versus not using it.
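
The metrics in the note can be expressed as a short Python sketch. The like_rate formula comes from the note itself; the exact formula for targeting_uplift is not stated, so the relative form used here (targeting rate divided by random rate, minus one) is an assumption. The counts are made up for illustration and are not the experiment's data.

```python
def like_rate(like_pv: int, exposure_pv: int) -> float:
    """like_rate = like_pv / exposure_pv, as defined in the note."""
    if exposure_pv <= 0:
        raise ValueError("exposure_pv must be positive")
    return like_pv / exposure_pv


def targeting_uplift(targeting_rate: float, random_rate: float) -> float:
    """Relative gain of the targeting group over the random group.

    Assumption: uplift = targeting_rate / random_rate - 1.
    """
    if random_rate <= 0:
        raise ValueError("random_rate must be positive")
    return targeting_rate / random_rate - 1.0


# Made-up counts for illustration only; these are not the experiment's data.
targeting = like_rate(like_pv=300, exposure_pv=400)  # 0.75
baseline = like_rate(like_pv=200, exposure_pv=400)   # 0.5
print(f"uplift = {targeting_uplift(targeting, baseline):.2%}")  # uplift = 50.00%
```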

We initiated our experiment on Telegram on August 22. The results are as follows: Out of 30,834 participants, 19,069 submitted valid information, 13,275 completed ad viewing, and 2,036 staked assets (2,036 users staked Codatta platform points, while 26 users staked USDT).

Observation 1: Among these, the targeting_uplift for no staking is 60.43%, the uplift for staking Codatta points is 66.27%, and the uplift for staking USDT is 322.23%. Based on these three numbers, the following conclusions can be drawn:

  • With the help of user annotation data, a targeted delivery strategy improved traffic efficiency by roughly 60% to 322% in this experiment.
  • As the staked asset’s link to real monetary value becomes more explicit, the corresponding uplift increases. This suggests that the more serious the stake, the more seriously people judge the advertising content.

Observation 2: The no-staking-targeting group has a like_rate of 66.82%, while the staking-targeting group (Codatta points + USDT) has a like_rate of 69.14%. From this we infer that the higher like rate of the staking group indicates that user annotation data provided under staking conditions is more accurate.

Based on our experiments, we believe that user annotation data, through the Vault model, can bring significant efficiency improvements in ads and recommendation scenarios while ensuring privacy and user authorization. This, in turn, allows users to receive incentives for their data contributions through vaults. The economic behavior of “staking as confidence” effectively enhances data credibility. Combined with the Codatta Data Vault product and the logic of “staking as confidence,” we can maximize data quality assurance and provide value for downstream applications such as AI and Ads. Therefore, we need a set of tools to enhance the reliability of user data and the security of staking within the protocol.

Utilize EigenLayer to enhance Codatta’s data quality and security

Our understanding of EigenLayer

To implement “staking as data confidence” more securely and conveniently, we engaged in in-depth discussions with EigenLayer and their ecosystem team, Othentic. EigenLayer, implemented through smart contracts, allows users to delegate their ETH to these contracts for restaking. This mechanism not only increases potential returns for stakers but also provides additional security for new modules in the Ethereum ecosystem.

EigenLayer v1’s restaking extends Ethereum’s security consensus while enhancing Ethereum’s liquidity. EigenLayer v2’s intersubjective staking addresses subjectively attributable faults through Ethereum’s social consensus, resolving them within the system through EIGEN/bEIGEN token forks.

EigenLayer can be viewed as a consensus extension tool, with stETH and op backing for AVSs (Actively Validated Services). Rather than attracting staking nodes from scratch, it is more efficient to use its ready-made solutions. EigenLayer also serves as a security strategy tool, covering scenarios of on-chain errors or forks. Ecosystem applications don’t need to implement their own handling of subjective or objective errors, as EigenLayer provides reward and punishment mechanisms to prevent collusion.

Further Integration with EigenLayer

As described above, data quality is crucial to the protocol. We believe a staking approach will improve the accuracy of user-provided data.

In detail:

  1. Each user decides to submit their information through User Annotation. Before submitting the information, they can choose whether to stake their assets. If they stake assets for the annotation, the information is more credible. Authentic user annotations can significantly enhance the value in various business scenarios.
  2. Users can stake to ensure the authenticity of their submitted User Annotations, thereby reducing the cost of obtaining accurate information (lowering the difficulty of requiring users to submit real identity information). Conversely, if a staked User Annotation is found to be incorrect, the staked assets of the corresponding user will be slashed.
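
The two points above can be illustrated with a small Python sketch of a staked annotation and its slashing rule. It is a toy model under stated assumptions: the `Annotation` class, the `resolve_challenge` function, and the `slash_fraction` parameter are hypothetical names introduced here, and how an annotation is judged incorrect is left outside the sketch.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    user_id: str
    payload: dict
    stake: float = 0.0   # 0 means the user chose not to stake
    slashed: bool = False

    @property
    def is_staked(self) -> bool:
        return self.stake > 0


def resolve_challenge(annotation: Annotation, found_incorrect: bool,
                      slash_fraction: float = 1.0) -> float:
    """Return the amount slashed from the annotation's stake.

    Only staked annotations that are found incorrect are slashed.
    `slash_fraction` is a hypothetical parameter, not from the article.
    """
    if not 0.0 <= slash_fraction <= 1.0:
        raise ValueError("slash_fraction must be between 0 and 1")
    if not (annotation.is_staked and found_incorrect):
        return 0.0
    penalty = annotation.stake * slash_fraction
    annotation.stake -= penalty
    annotation.slashed = True
    return penalty


# Illustrative values only.
a = Annotation("user-1", {"gender": "female", "birth_year": 1992}, stake=10.0)
print(resolve_challenge(a, found_incorrect=True, slash_fraction=0.5))  # 5.0
print(a.stake)  # 5.0
```

An unstaked annotation is never slashed in this model; it simply carries less credibility for downstream use.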

We therefore designed our on-chain process on EigenLayer as shown in the following flowchart (using a user’s demographic annotation data as an example):

At present, we have confirmed the feasibility of our concepts and plans with the EigenLayer and Othentic teams. The next step is to implement the solutions outlined in the architectural diagram.

Staking and Slashing Mechanism with EigenLayer

We aim to implement this staking-as-confidence mechanism within the EigenLayer environment, where rewards and slashes are directed at Node Operators. Individual stakers within the Operator cannot be targeted separately.

Three possible staking & slashing implementation mechanisms:

  • Solution 1: Each user who contributes data becomes a Node Operator, and this Operator can only secure Codatta AVS. Thus, the number of Node Operators would match the number of stakers. However, users may have difficulty setting up operator nodes.
  • Solution 2: There is only one Codatta Node Operator, so all users’ assets are staked to this Operator. However, the reward/slash behavior of this Operator needs to be customized. For example, if a user’s annotation is found to be fraudulent, the slash is applied to the Operator, who can then slash the corresponding user. This ensures that the slashing mechanism can be applied to a specific user.
  • Solution 3: Combine Solution 1 and Solution 2. Not all users need to become Operators. Users with significant data contribution capabilities and asset strength can act as Operators, while other users can choose to stake with these Operators. These Operators will then secure Codatta AVS on behalf of the users. Of course, in the event of a slashing scenario, the Operator will pass the slashing down to the assets of the specific users involved.

After discussing with the EigenLayer and Othentic teams, we decided to implement Solution 3, as illustrated in the above image. Since there is currently no slash interface implemented in EigenLayer, all user-level slashing will be handled by the delegated operator. Users will stake their assets to the operator, who will be responsible for submitting the right data and ensuring its quality. When a data challenge occurs and is validated successfully, the operator will be slashed and will deduct the corresponding amount from the stake account of the user it represents. We’ll implement the AVS shortly.
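
The pass-through slashing described above can be sketched as a toy Python model. It does not use any EigenLayer or Othentic API; the `Operator` class and its methods are hypothetical, and capping the deduction at the user's balance is a choice made for this sketch, since the article does not say what happens when a slash exceeds a user's stake.

```python
from collections import defaultdict


class Operator:
    """Toy model of a delegated operator holding per-user stake accounts."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.accounts = defaultdict(float)  # user_id -> staked amount

    def delegate(self, user_id: str, amount: float) -> None:
        """Record a user's stake with this operator."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.accounts[user_id] += amount

    @property
    def total_stake(self) -> float:
        return sum(self.accounts.values())

    def slash(self, user_id: str, amount: float) -> float:
        """Apply an operator-level slash and pass it down to one user.

        The deduction is capped at that user's balance, so other
        delegators' stake is untouched in this sketch.
        """
        penalty = min(amount, self.accounts.get(user_id, 0.0))
        if penalty > 0:
            self.accounts[user_id] -= penalty
        return penalty


# Illustrative values only.
op = Operator("operator-1")
op.delegate("alice", 100.0)
op.delegate("bob", 50.0)
print(op.slash("alice", 30.0))  # 30.0
print(op.total_stake)           # 120.0
```

Tracking stake per user inside the operator is what makes it possible to target a specific user even though the slash itself is directed at the operator.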

Conclusion

Codatta has successfully demonstrated the potential of its innovative “staking-as-confidence” mechanism in improving data quality and user engagement in the Web3 environment. The Telegram experiment yielded significant insights:

  • Targeted content delivery based on user-provided data showed a substantial increase in engagement, with uplifts ranging from 60% to 322%.
  • Users who staked assets provided more accurate data, as evidenced by the higher like rates in the staking group (69.14%) compared to the non-staking group (66.82%).
  • The value of the staked asset correlated with the quality of user engagement, suggesting that higher-value stakes lead to more thoughtful interactions.

These findings validate Codatta’s approach to data collection and utilization, highlighting the potential for a new paradigm in user data management that balances privacy, accuracy, and user incentives.

Future Works

Building on the success of the initial experiments, Codatta’s roadmap includes:

  • Integration with EigenLayer: Implementing Solution 3 for staking and slashing mechanisms, leveraging EigenLayer’s infrastructure to enhance security and scalability.
  • Expanding Data Types: Broadening the scope of collectible data beyond demographic information to include behavioral and preference data will further enhance the value proposition for both users and advertisers.
  • Refining Reward Mechanisms: Developing more sophisticated reward structures that align with the quality and quantity of data provided, encouraging sustained user engagement.
  • Cross-Chain Compatibility: Exploring integration with other blockchain networks to increase accessibility and broaden the user base.
  • Privacy Enhancements: Continuing to innovate in privacy-preserving technologies to ensure user data remains secure while maintaining utility.
  • AI Integration: Leveraging artificial intelligence to improve data analysis and targeted content delivery, enhancing the overall efficiency of the platform.

These future works aim to solidify Codatta’s position as a leader in decentralized data management and targeted advertising, fostering a more efficient, transparent, and user-centric digital ecosystem.

