Machine Learning Data Analyst

Swooped · United States of America · Hybrid

About the job

About Our Client

A highly motivated and experienced Data Analyst is sought to join a dynamic AI & Threat Analytics team. This role will be instrumental in driving the development of autofill classification models through effective management, optimization, and analysis of datasets. This position is fully remote, with potential for a hybrid schedule for candidates located in the El Dorado Hills, CA, or Chicago, IL metro areas.

Our Client offers cybersecurity software that is trusted by millions of users and thousands of organizations worldwide. The solutions are available in 21 languages and sold in over 120 countries. This is an opportunity to join one of the fastest-growing cybersecurity organizations and contribute to the data management and preparation that will propel next-generation autofill and classification models.

About the Role

In the role of Data Analyst on the AI & Threat Analytics team, you will oversee the entire lifecycle of data collection, preprocessing, and management for machine learning models. Collaborating closely with engineers, you will leverage HTML and DOM structures to build and refine datasets, while conducting feature engineering experiments to improve autofill classification models. Additionally, you will generate synthetic datasets using LLMs and investigate complex data patterns to optimize model performance. This position plays a critical role in advancing ML initiatives by ensuring access to the highest quality data.

Responsibilities

- Own the complete data collection, cleaning, and preprocessing pipeline for HTML-based datasets utilized in machine learning applications.

- Employ web analysis tools to extract and structure data from DOM environments for model training and validation.

- Collaborate with ML Engineers to support feature engineering experiments and produce training datasets that align with model requirements.

- Generate and enhance synthetic datasets using LLMs to improve the balance and availability of training data.

- Analyze data through dimensionality reduction techniques (e.g., t-SNE, PCA, UMAP) to assess feature strengths and elevate dataset quality.

- Automate data workflows to streamline data processing, manipulation, and transformation.

- Develop and maintain thorough documentation for data workflows, processes, and methodologies to ensure lineage, reproducibility, and scalability.

- Establish validation and data-quality systems to maintain consistency and integrity across all datasets.

Requirements

- 2+ years of professional experience as a Data Analyst, ideally within a cybersecurity or ML-focused environment.

- Proficiency in Python for data manipulation and analysis (Pandas, NumPy) and for automating data workflows.

- Strong experience with web analysis tools (e.g., Selenium, BeautifulSoup) and a thorough understanding of HTML and DOM structures for data extraction and preprocessing.

- Familiarity with natural language processing (NLP) techniques including tokenization, stop word removal, and lemmatization for text data preparation.

- Experience generating synthetic datasets and using LLMs to enhance machine learning data.

- Ability to effectively collaborate with ML Engineers and other technical teams.

- Strong problem-solving capabilities and a meticulous approach to maintaining data quality and governance.

- Familiarity with cloud platforms (AWS, GCP, Azure) for data storage and processing.

- Bachelor’s degree in Data Science, Statistics, Computer Science, or a related field, or equivalent experience.

- Due to involvement in GovCloud, all applicants must be US Persons.

Benefits

- Medical, Dental & Vision (Inclusive of domestic partnerships)

- Employer-Paid Life Insurance & Employee/Spouse/Child Supplemental Life Insurance

- Voluntary Short/Long Term Disability Insurance

- 401k (Roth/Traditional)

- A generous PTO plan that acknowledges your commitment and seniority (including paid Bereavement/Jury Duty, etc.)

- Above-market annual bonuses
