Life at Datategy : James Jiang
Hey, I am James, Data Engineer!
After obtaining my “Grande École d’ingénieur” diploma in Computer Science and my dual MSc degree specializing in AI, Big Data & DevOps, I decided to pursue an exciting career in data. I have been working professionally for more than 2 years now, and I am still amazed by how the latest deep learning research and big data frameworks can be used in user-oriented software.
In real life, I am just a computer nerd who spends most of his time playing e-sports games with friends, programming personal projects on GitHub and listening to lo-fi songs.
What's your role at Datategy?
At Datategy, I have the opportunity to hold two roles on the PapAI platform project:
The first one is Data Engineer: I am responsible for adding and maintaining features related to parallel ETL or data analytics operations. This means working on the Spark service, where the actual computations are done, and on the web backend to integrate the changes.
The second one is Lead of the Data Ingestion Squad: I am the technical referent for the current and future ETL functionalities on the platform. This means accurately estimating the technical difficulty of any ETL feature to be updated or added on the platform. The objective is to help our Product Owner efficiently prioritize tasks within the Squad and between Squads. I am also happy to work really closely with three super developers in the squad.
What does your typical workday look like?
My day starts at 9AM (±5%) with checking my inbox and agenda. I really like having a personal Google Agenda to list notes, the tasks done the previous day and the TODOs for the current day. This eases the mental burden of remembering every detail of the previous day and helps me get into work mode effectively.
Then, after the daily scrum is over, it’s mostly three types of tasks. The first and main one is programming in Scala on my Mac while listening to Lo-fi Girl and staying well hydrated with my little water bottle (≈1.5L). Incomprehensible bugs and exceptions are part of our daily life, and let’s not forget the requirements expected by our end users. The second is supervising junior developers in Scala and Spark, but also learning from them. We often encounter never-seen-before Spark issues and take time together to solve them. It helps us grow technically. The third, as Lead of the Data Ingestion Squad, is being highly available. This can mean technical requests coming from our Product Owner, Tech Lead, CTO or Squad members.
And my day usually ends at 6PM (+0%~20%) with filling in my agenda for the next day.
Tell us a little about the Datategy AI solution
PapAI is a collaborative and intuitive ETL and ML platform for managing data projects. It allows users to create projects and build a complete data flow.
The data flow corresponds to a directed acyclic graph where each node can be a dataset or a no or low code operator. A dataset can come from an operator computation, or from one of several possible sources provided by the user. We support local file imports in formats such as CSV or Excel, as well as database imports from sources such as PostgreSQL. It is also possible to import data from buckets. An operator can be a join, for example, and there are more, but I will let you discover this part through the real demos on our YouTube channel, Datategy.
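To make the graph idea concrete, here is a minimal Python sketch of such a data flow, using only the standard library. It is not PapAI's implementation: the node names, the `flow` dictionary and both helper functions are hypothetical and only illustrate how a directed acyclic graph of datasets and operators yields a valid execution order.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical flow: the node names are illustrative, not PapAI identifiers.
# Each key maps a node to the set of nodes it takes as inputs.
flow = {
    "customers_csv": set(),                  # dataset imported from a local CSV file
    "orders_pg": set(),                      # dataset imported from a PostgreSQL table
    "join": {"customers_csv", "orders_pg"},  # operator consuming both datasets
    "joined_dataset": {"join"},              # dataset produced by the operator
}


def execution_order(graph):
    """Return the nodes so that each one appears after all of its inputs."""
    return list(TopologicalSorter(graph).static_order())


def is_acyclic(graph):
    """Return True if the graph has no cycle, i.e. it is a valid data flow."""
    try:
        execution_order(graph)
        return True
    except CycleError:
        return False


order = execution_order(flow)
print(order)
```

The two source datasets can come out in either order, but the join always follows both of them and the resulting dataset always comes last. A graph in which two nodes depend on each other would be rejected by `is_acyclic`.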
What's your favorite part of your job?
Feeling the synergy between my teammates and me, delivering a new feature or a feature upgrade after several weeks of hard work, and finally making it work on the PapAI platform is the best feeling. I really like building data and user-oriented software, and doing it with smart people who have such impressive educational or professional backgrounds is even better.
At Datategy, the number of Scala developers continues to grow, and I am part of this humble community. We are building fully asynchronous, lightweight and fast REST API microservices with Scala. Our Spark operations are also written in Scala. I like this programming language and really hope the number of “Scalistes” at Datategy keeps growing. For the Python girls and boys, we also have “Pythonistes” among our ranks.
What's your next challenge at Datategy?
In my opinion, the next upcoming challenge for Datategy will be horizontal scalability, both of the organization and of the PapAI platform.
Horizontal scalability of the organization, because we are growing. This means more developers joining, and all these smart people must be onboarded smoothly on the PapAI project. It is not just an administrative question, but also a responsibility of leads and developers: all main functions must have Docstrings or ScalaDocs with Markdown specs, and we should aim for 100% code coverage in our unit tests. A proper system design is also a must when we are talking about microservice Git repositories. This makes newcomers' contributions easier and more enjoyable.
PapAI is currently usable with all our ETL and ML features through contracts and private deployment. Horizontal scalability of the PapAI platform, then, because we want to offer PapAI as a complete, cloud-based, autonomous and hyper-scalable SaaS platform, where many users can connect and easily build their own data flow with no or low code ETL and ML operators ⚙️, just through their web interface and without downloading any software. Cluster resources, in cores or RAM, will be acquired and released directly on demand. This will be possible, and we have already started some rework of the current PapAI architecture (≈ Kubernetes).