We are looking for an outstanding Data Engineer to join our investment team to build out our big data analytics pipeline.
Design and build the data pipelines.
Import structured and unstructured data from vendors, then clean and store it.
Build tools that help automate the data-cleaning process.
Answer companywide questions about the nature, completeness and correctness of the data, using judgement and primary sources to determine whether on-the-fly updates are warranted.
On-board new data sources and maintain external relationships.
Integrate new data sets into the existing database structure. Plan upload protocols and sanity checks to ensure data integrity.
It is critical that the candidate bring a CS perspective to the process of data management; strong experience in systems design is desired.
If selected, the Data Engineer will have significant responsibility in extracting signals from structured and unstructured data, and in transforming them into tradeable insights.
BS, MS, or PhD in Computer Science, Electrical Engineering, Statistics, or related discipline
Proficiency in Python and analytic packages such as NumPy, pandas, and scikit-learn
Proficiency in SQL
• Processing big data with Python using Spark, Dask, or Blaze
• OLAP with AWS Redshift or Google BigQuery
• ETL orchestration with Apache Airflow, AWS Glue, or Google Dataflow