Real-Time Data Collection: Node.js and Asynchronous Programming Insights

API Data Collection
Data Collection
Javascript
Node.js
Real-time Data Collection
Research
Blockchain
Published

June 28, 2024

Thumbnail designed by Freepik.

Introduction

When I began my research on Ethereum, I started by studying the work of researchers who preceded me, and then replicating it so I could learn from and experiment with the results. One study stood out: Azevedo Sousa et al. (2021), which aimed to correlate the pending time of Ethereum transactions with the gas fee they offered.



Some theory behind it

Before getting to the problems I faced, it’s essential to understand the concepts of pending time and gas fee in Ethereum transactions. Pending time is how long a transaction waits in the queue before it is included in a block. The gas fee, on the other hand, is the incentive paid to miners to process and validate transactions. Higher gas fees are generally expected to lead to shorter pending times, since miners prioritize transactions that offer greater rewards. That common assumption is what the study set out to validate.
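To make these two quantities concrete, here is a minimal sketch of how they can be computed. It assumes the legacy pricing model, where the fee is the gas used multiplied by the gas price, and it assumes both timestamps come from the observer's own clock. The function names are mine, for illustration only.

```javascript
// Pending time: gap between first seeing a transaction and seeing it in a block.
// Both arguments are millisecond timestamps taken from the same local clock.
function pendingTimeSeconds(firstSeenMs, includedMs) {
  if (includedMs < firstSeenMs) {
    throw new RangeError('inclusion time precedes first-seen time');
  }
  return (includedMs - firstSeenMs) / 1000;
}

// Fee in wei under legacy pricing: gas used * gas price.
// BigInt avoids precision loss, since wei amounts exceed Number.MAX_SAFE_INTEGER.
function feeWei(gasUsed, gasPriceWei) {
  return BigInt(gasUsed) * BigInt(gasPriceWei);
}

// A plain ETH transfer uses 21000 gas; at 50 gwei that is 0.00105 ETH.
const GWEI = 1_000_000_000n;
console.log(feeWei(21000, 50n * GWEI)); // 1050000000000000n
console.log(pendingTimeSeconds(1000, 13000)); // 12
```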

Example of a transaction waiting to be accepted in the Blockchain

The challenges

I tried analyzing newer transaction data using the same methodology, to see if the trends identified in the earlier study still held. However, I ran into a significant hurdle: the APIs available for Ethereum data did not provide information on pending times. On top of that, I could no longer scrape the data the original study had used, because the web infrastructure had changed. What could I do, then?

Real-Time Crawler

To overcome this limitation, I decided to develop a crawler that would track new transactions in real time. With this approach, I could measure how long each transaction took to be approved and added to the blockchain. For this task, I used the web3.eth module of the web3.js library. This was my first encounter with Node.js, so the project came with a learning curve of its own.
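The idea can be sketched as follows. This is not the project's actual code, only a minimal illustration under a few assumptions: a web3.js 4.x instance connected through a WebSocket provider, a node that exposes the `pendingTransactions` and `newBlockHeaders` subscriptions, and timestamps taken from the local clock. `PendingTracker` and `attach` are hypothetical names.

```javascript
// Pure bookkeeping with no network access, so it can be tested in isolation.
class PendingTracker {
  constructor(now = Date.now) {
    this.now = now;             // injectable clock, returns milliseconds
    this.firstSeen = new Map(); // txHash -> time the hash was first observed
  }

  // Record a pending transaction the first time its hash shows up.
  onPending(txHash) {
    if (!this.firstSeen.has(txHash)) this.firstSeen.set(txHash, this.now());
  }

  // Given the hashes in a new block, return one record per tracked transaction.
  onBlock(blockNumber, txHashes) {
    const includedAt = this.now();
    const records = [];
    for (const hash of txHashes) {
      const seenAt = this.firstSeen.get(hash);
      if (seenAt === undefined) continue; // never observed while pending
      this.firstSeen.delete(hash);
      records.push({ hash, blockNumber, pendingMs: includedAt - seenAt });
    }
    return records;
  }
}

// Wiring to a node (assumes web3.js 4.x, where subscribe() returns a promise).
async function attach(web3, tracker, onRecords) {
  const pending = await web3.eth.subscribe('pendingTransactions');
  pending.on('data', (txHash) => tracker.onPending(txHash));

  const heads = await web3.eth.subscribe('newBlockHeaders');
  heads.on('data', async (header) => {
    try {
      // false: fetch only transaction hashes, not full transaction objects
      const block = await web3.eth.getBlock(header.number, false);
      onRecords(tracker.onBlock(header.number, block.transactions ?? []));
    } catch (err) {
      console.error('failed to process block', header.number, err);
    }
  });
}

// Usage (the endpoint is a placeholder, not a real URL):
//   const { Web3 } = require('web3');
//   attach(new Web3('wss://<your-node-endpoint>'), new PendingTracker(), console.log);
```

Transactions that are dropped or replaced never appear in a block, so a long-running version of this sketch would also need to evict old entries from `firstSeen`.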

One of the most challenging aspects of this project was dealing with asynchronous programming. I needed to store data in real time while simultaneously tracking new transactions. Managing these concurrent tasks meant learning asynchronous programming techniques, which was the steepest part of the learning curve for me at the time.
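One common way to store records without stalling the event handlers is to funnel every write through a single promise chain, so writes happen in order while new events keep arriving. The sketch below illustrates that pattern; it is not necessarily what the project does, and `RecordWriter`, `fileWriter`, and the JSON Lines output format are assumptions made for the example.

```javascript
const fs = require('node:fs/promises');

// Serializes writes through one promise chain: each append starts only after
// the previous one finishes, and callers never have to wait for the disk.
class RecordWriter {
  constructor(appendFn) {
    this.appendFn = appendFn;      // async (line) => void
    this.tail = Promise.resolve(); // end of the current write queue
  }

  // Queue one record; returns a promise that settles once it has been handled.
  write(record) {
    // BigInt values (common in web3.js 4.x results) are not valid JSON,
    // so they are stored as strings.
    const line = JSON.stringify(record, (_key, value) =>
      typeof value === 'bigint' ? value.toString() : value) + '\n';
    this.tail = this.tail
      .then(() => this.appendFn(line))
      .catch((err) => console.error('write failed:', err)); // keep the chain alive
    return this.tail;
  }

  // Resolves when every write queued so far has been handled.
  flush() {
    return this.tail;
  }
}

// Convenience factory that appends one JSON object per line to a file.
const fileWriter = (path) =>
  new RecordWriter((line) => fs.appendFile(path, line));
```

An event handler can then call `writer.write(record)` and return immediately, and `await writer.flush()` before shutting down makes sure nothing queued is lost.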

What I got out of it

Although the data collected in this project has not been used in any published paper, the experience was incredibly valuable for my personal and professional growth. It gave me hands-on experience in real-time data tracking, asynchronous programming, and working with new technologies. The project has also laid a foundation for possible future research in the field of blockchain and cryptocurrencies. Any feedback or suggestions are welcome! The project is available on GitHub.

References

Azevedo Sousa, José Eduardo de, Vinícius Oliveira, Júlia Valadares, Glauber Dias Gonçalves, Saulo Moraes Villela, Heder Soares Bernardino, and Alex Borges Vieira. 2021. “An Analysis of the Fees and Pending Time Correlation in Ethereum.” International Journal of Network Management 31 (3): e2113.