Marcus

Latin America
Brazil

$65/hour

English
About me

Professional with 6 years of experience in AI/machine learning/data science roles and 4 in research roles.

Work experience with:

AI Agents & Generative AI:

• Large Language Models (LLMs) and Small Language Models (SLMs)

• Retrieval-Augmented Generation (RAG)

• Low-Rank Adaptation (LoRA)

Frameworks & Tools:

• LangGraph, LangChain, CrewAI, LangSmith, Langflow

• LLM Routing

Machine Learning & AI:

• Deep Learning (PyTorch, TensorFlow)

• Supervised and Unsupervised Learning (Classification, Regression)

• Natural Language Processing (NLP) and Language Models (LLMs, SLMs)

• Neural Networks (Transformers, RNNs, CNNs)

Data Engineering & Pipelines:

• ETL/ELT, Data Lakes, and Data Warehousing

• Cloud: AWS (S3, Glue, Redshift, Lambda, Bedrock), GCP (Vertex AI, BigQuery, Dataflow)

• Distributed Processing: Apache Spark

• Orchestration: Apache Airflow

• Databases: SQL, NoSQL (MongoDB)

MLOps & Deployment:

• CI/CD Pipelines (GitHub Actions, GitLab CI/CD)

• Containerization and Orchestration: Docker, Kubernetes

• Model Deployment: FastAPI, Flask, MLflow

• Monitoring and Scaling: Grafana, AWS SageMaker, GCP AI Platform


Graduate degree in Data Science; Master's studies in Aerospace Engineering.

Skills
Machine Learning
80.0%
(6yrs)
Python
90.0%
(6yrs)
Data Science
80.0%
(6yrs)
Scientific Research
60.0%
(3yrs)
Data Analytics
80.0%
(8yrs)
FastAPI (Python)
80.0%
(5yrs)
TensorFlow
60.0%
(4yrs)
SQL
80.0%
(6yrs)
Artificial Intelligence
80.0%
(6yrs)
HTML/CSS
70.0%
(3yrs)
JavaScript
70.0%
(3yrs)
Django
70.0%
(3yrs)
Data Modelling
70.0%
(6yrs)
Prompt Engineering
80.0%
(6yrs)
Large Language Models (LLM)
80.0%
(6yrs)
LLM Training
80.0%
(6yrs)
Agentic AI
80.0%
(3yrs)
LangChain
80.0%
(3yrs)
LangGraph
80.0%
(3yrs)
Experience
Python/AI Engineer | Turing
Jan 2023 - Present

Client - OpenAI

▪ Developed and optimized algorithms to train highly autonomous systems (ChatGPT) and generative AI APIs, adaptable to a wide range of language tasks and capable of processing millions of production requests daily.

▪ Solved coding challenges and conducted code reviews prior to integration into the model/chatbot pipeline.

▪ Curated data and prepared datasets to enhance model performance in browsing and search capabilities, rule adherence, and the generation of custom files (.csv, Excel, YAML, XML) tailored to specific user needs.

▪ Developed unit tests to assess code accuracy in Python (pandas, FastAPI, TensorFlow), JavaScript, HTML, and CSS, ensuring robustness across diverse programming environments.

▪ Analyzed and evaluated model-generated code, identified errors, and crafted optimal responses to enhance fine-tuning efforts.

▪ Introduced intentional bugs to strengthen LLMs' ability to detect and correct errors, improving overall model reliability in debugging scenarios.

Data Science Consultant | Independent
Jan 2023 - Present


Head of Data and Data Scientist | FieldPRO · Full-time
Jun 2022 - Dec 2022

▪ Led the data team on machine learning/deep learning, ETL, MLOps, and data visualization work, transforming raw data collected in the field into accurate, reliable environmental information that farmers use for decision-making.

▪ Built classification models for detecting crops from satellite images. Built classification models to detect rainfall and regression models to estimate its amount. The models are used by more than 500 products and hundreds of customers around the world, from family farmers to giants such as Bayer, John Deere, and Banco do Brasil.

▪ Developed a world-first technique that determines rainfall (through a patented rain sensor) in near real-time.

▪ Developed soil models that enable our sensors to estimate water storage in different soil types. The models are calibrated by combining agronomic water balance models with machine learning techniques.

▪ Other models:

o Productivity models for soybean and cotton

o Harvest date estimation

o Crop segmentation and classification (soybean, cotton, corn, coffee, wheat)

o Deep learning models for pest, disease, and weed detection

o Deep learning models for fruit detection in orchards (apple, peach, citrus, grape, tomato)

▪ Technologies: Python, PyTorch, Google Cloud Platform, SQL, InfluxDB, Git/Bitbucket, Grafana, CI/CD, Docker, REST APIs, Linux Shell, IoT.

Data Scientist and Machine Learning Engineer | Itaú Unibanco SA
Sep 2018 - Jun 2022

▪ Mentored junior data scientists who joined the team.

▪ Computer vision projects:

o Built an OCR (Optical Character Recognition) tool for the HR area of the largest private bank in Latin America, for automatic treatment, processing, and postprocessing of the personal documents of employees and their family members (Brazilian documents: RG, driver's license, birth/marriage/death certificates, among others). I also built a multiclass classifier for personal documents using Convolutional Neural Networks.

o Developed an OCR software package for the tax payments department that allowed automatic extraction and field recognition of city tax invoices, accelerating the payment process and reducing legal risk.

o Results: The project reduced and reallocated headcount in the hiring, termination, reimbursement, and health benefits departments, saving about US$ 500,000 per year.

o Technologies: Tesseract, PyTorch, TensorFlow, OpenCV, CNN, ResNet, VGG, YOLO, LabelImg/Label Studio annotation toolset, NLP.

▪ Time series analysis and forecasting projects:

o Developed (with a team) an automatic analysis and anomaly detection tool for the Directorate of Payments and Operations, which processes financial account time series (4+ months of history), flags anomalies, and forecasts risk levels. I also developed a web app that runs these models in the background.

▪ Computational optimization projects:

o Optimized the processing of a class of investment products, the Certificate of Structured Operations. We used Monte Carlo simulation, which I adapted in PyTorch to run on GPU, reducing processing time.

o Results: Processing time fell from 4 minutes to 40 seconds (a 6x speedup), yielding an NPV of US$ 3,000,000 over 5 years.

Data Analyst and Data Scientist Trainee | Itaú Unibanco SA
Apr. 2019 - Dec. 2019

▪ Selected for one of 30 openings from more than 6,000 candidates for the Itaú Analytics training program, a partnership between Itaú Unibanco (for work as a data scientist) and the Aeronautics Institute of Technology (for graduate studies).

▪ Worked at the Advanced Analytics center, where I helped develop computer vision tools and internal packages.

Product Engineer Intern | Itaú Unibanco SA
Sep. 2018 - Apr. 2019

▪ Developed an Excel tool (VBA) for managing Asset Engineering projects.

▪ Worked on projects with the Business Product team to solve the main customer complaints using Scrum and agile project management.

Researcher and Teaching Assistant | Aeronautics Institute of Technology
Jan. 2014 - Mar. 2018

▪ Awarded a federal Graduate Research Fellowship from CAPES Brazil.

▪ Research topics: optimization, orbital dynamics, and spacecraft navigation.

▪ Published two papers and gave two conference presentations.

▪ Teaching assistant for the course "MVO41 - Flight Dynamics" (Bachelor's in Aerospace Engineering).