Professional with 6 years of experience in AI/machine learning/data science roles and 4 in research roles.
Work experience with:
AI Agents & Generative AI:
• Large Language Models (LLMs) and Small Language Models (SLMs)
• Retrieval-Augmented Generation (RAG)
• Low-Rank Adaptation (LoRA)
Frameworks & Tools:
• LangGraph, LangChain, CrewAI, LangSmith, Langflow
• LLM Routing
Machine Learning & AI:
• Deep Learning (PyTorch, TensorFlow)
• Supervised and Unsupervised Learning (Classification, Regression)
• Natural Language Processing (NLP) and Language Models (LLMs, SLMs)
• Neural Networks (Transformers, RNNs, CNNs)
Data Engineering & Pipelines:
• ETL/ELT, Data Lakes, and Data Warehousing
• Cloud: AWS (S3, Glue, Redshift, Lambda, Bedrock), GCP (Vertex AI, BigQuery, Dataflow)
• Distributed Processing: Apache Spark
• Orchestration: Apache Airflow
• Databases: SQL, NoSQL (MongoDB)
MLOps & Deployment:
• CI/CD Pipelines (GitHub Actions, GitLab CI/CD)
• Containerization and Orchestration: Docker, Kubernetes
• Model Deployment: FastAPI, Flask, MLflow
• Monitoring and Scaling: Grafana, AWS SageMaker, GCP AI Platform
Graduate degree in Data Science; Master's studies in Aerospace Engineering.
Client - OpenAI
▪ Developed and optimized algorithms to train highly autonomous systems (ChatGPT) and generative AI APIs, adaptable to a wide range of language tasks and capable of processing millions of production requests daily.
▪ Solved coding challenges and conducted code reviews prior to integration into the model/chatbot pipeline.
▪ Curated data and prepared datasets to enhance model performance in browsing and search capabilities, rule adherence, and the generation of custom files (.csv, Excel, YAML, XML) tailored to specific user needs.
▪ Developed unit tests to assess code accuracy in Python (pandas, FastAPI, TensorFlow), JavaScript, HTML, and CSS, ensuring robustness across diverse programming environments.
▪ Analyzed and evaluated model-generated code, identified errors, and crafted optimal responses to enhance fine-tuning efforts.
▪ Created and introduced intentional bugs to improve LLMs' ability to detect and correct errors, increasing overall model reliability in debugging scenarios.
▪ Led the data team, working on machine learning/deep learning, ETL, MLOps, and data visualization challenges to provide accurate and reliable environmental information about the field, transforming raw field data into useful information for farmers' decision-making.
▪ Built classification models to detect crops in satellite images, classification models to detect rainfall occurrence, and regression models to estimate rainfall amounts. The models are used by more than 500 products and hundreds of customers around the world, from family farmers to giants such as Bayer, John Deere, and Banco do Brasil.
▪ Developed a world-first technique that determines rainfall (through a patented rain sensor) in near real-time.
▪ Developed soil models that enable our sensors to estimate water storage in different soil types. The models are calibrated by combining agronomic water balance models with machine learning techniques.
▪ Other models:
o Productivity models for soybean and cotton
o Harvest date estimation
o Crop segmentation and classification (soybean, cotton, corn, coffee, wheat)
o Deep learning models for pest, disease, and weed detection
o Deep learning models for fruit detection in orchards (apple, peach, citrus, grape, tomato)
▪ Technologies: Python, PyTorch, Google Cloud Platform, SQL, InfluxDB, Git/Bitbucket, Grafana, CI/CD, Docker, REST APIs, Linux Shell, IoT.
▪ Mentored junior data scientists who joined the team.
▪ Computer vision projects:
o Built an OCR (Optical Character Recognition) tool for the HR area of the largest private bank in Latin America, for automatic treatment, processing, and postprocessing of personal documents of employees and their family members (Brazilian documents: RG, Driver's License, Certificate of Birth/Marriage/Death, among others). I also built a multiclass classifier for personal documents using Convolutional Neural Networks.
o Developed an OCR software package for the tax payments department that allowed automatic extraction and field recognition of city tax invoices, accelerating the payment process and reducing legal risk.
o Results: The project reduced and reallocated headcount in the admission, termination, reimbursement, and health benefits departments, saving about US$ 500,000 per year.
o Technologies: Tesseract, PyTorch, TensorFlow, OpenCV, CNN, ResNet, VGG, YOLO, LabelImg/Label Studio annotation toolset, NLP.
▪ Time series analysis and forecasting projects:
o Developed (with a team) an automatic analysis and anomaly detection tool for the Directorate of Payments and Operations, which processes financial account time series (4+ months of history), flags anomalies, and forecasts risk levels. I also developed a web app that runs these models in the background.
▪ Computational optimization projects:
o Optimized the processing of a class of investment products, the Certificate of Structured Operations. I adapted the Monte Carlo simulation used for processing to run on GPU with PyTorch, reducing processing time.
o Results: Processing time was reduced from 4 minutes to 40 seconds (a 6x speedup), resulting in an NPV of US$ 3,000,000 over 5 years.
▪ Selected for one of 30 openings from more than 6,000 candidates for the Itaú Analytics training program, a partnership between Itaú Unibanco (for work as a data scientist) and the Aeronautics Institute of Technology (for graduate studies).
▪ Worked at the Advanced Analytics center, where I participated in the development of computer vision tools and internal packages.
▪ Developed an Excel tool (VBA) for managing Asset Engineering projects.
▪ Worked on projects with the Business Product team to solve the main customer complaints using Scrum and Agile project management.
▪ Awarded a federal Graduate Research Fellowship from CAPES Brazil.
▪ Research topics: Optimization and orbital dynamics. Spacecraft navigation.
▪ Published two papers and gave two conference presentations.
▪ Teaching Assistant - Course: "MVO41 - Flight Dynamics" (Bachelor in Aerospace Engineering)