Data Engineer

Make the industry more sustainable and help write our company’s future

The Challenge

Software runs the universe and our developers write our company’s future. That’s why we are looking for a Data Engineer to take Sensorfact to the next level. Our customer base is growing, and so are our data volumes. Your challenge is to empower our data scientists, who develop forecasting and pattern recognition algorithms fed by the 20 million energy measurements we get each day. Can you design and build infrastructure and scalable data pipelines to train machine learning models in a distributed fashion? Are you excited about building streaming applications that can alert customers in real time to energy waste and machine failure? Your work will have a massive impact on the company: gaining insights from energy consumption data is the core of our value proposition.

What you will be doing

  • You will be responsible for transforming our savings algorithms and machine learning models into production-ready applications to create lasting value.
  • You have the opportunity to work with state-of-the-art tools for machine learning operations (MLOps), serverless and event-driven architectures, and cloud services in AWS.
  • We are moving our extensive savings algorithm toolkit to a scalable microservices architecture. You will play a central role in designing and implementing our plan of attack.
  • You will help to set up a platform to run machine learning experiments at scale: having the right data and compute available without impacting our ingestion, running customized models for our customers, and providing our domain experts with insightful tools.
  • You will work closely with our Data Scientists to design robust and production-ready machine learning pipelines that continuously generate insights across our customer base. Additionally, you will work with our Backend and DevOps colleagues to ensure stable and scalable systems.
  • Being part of a scale-up, you are proactive in prioritizing and solving the needs of our fast-growing group of customers.

The key technologies you will be working with

As we are scaling up our platform with a small team, we leverage new technologies to keep performance and productivity high. Right now our core platform is based on microservices written in Node.js connecting to the NATS message bus. Data is accessible through GraphQL APIs managed by Hasura. Time series data is stored raw in MongoDB and processed in InfluxDB, and Postgres is our workhorse. Data analysis code is written in Python. We use JupyterHub to experiment and interact with analytics models and present them to our in-house energy consultants. Our source code is on GitLab and we use a mix of GitLab CI and Jenkins for CI/CD.

How we do it

We do Scrum with 2-week sprints, sprint planning and retrospective sessions. Our stand-ups are at 9:30 and if you’re not there you can chime in over Meet. We keep track of things using Linear, Google Drive and Outline, and we stay in touch with each other over Slack. The course is determined by quarterly goals, set collaboratively by business, data science, development and product teams.

We know how important it is to get in the zone and write beautiful code, so we schedule most meetings in the morning and keep the afternoon quiet (we try). We work from home about 70% of the time, but we enjoy meeting each other in the office regularly – COVID allowing, of course.

You are perfect for this job, because you…

  • Have an MSc (or PhD) in Computer Science, Distributed Systems, Artificial Intelligence, or a comparable analytical / technical field;
  • Are a medior (3+ years) data engineer who is fluent in creating cloud-based software applications with Python;
  • Are fluent in professional software engineering practices (version control, merge requests, testing, code standards, CI/CD);
  • Have experience with modern cloud and data technologies such as Spark, Kafka, Kubernetes, Docker, Jenkins, AWS Lambda;
  • Have experience with deploying and maintaining data and machine learning pipelines at scale;
  • Are passionate about at least one of the following (the more the better!): serverless and event-driven architectures, machine learning operations, stream processing, saving our climate, scale-up life;
  • Have knowledge of modern database systems, preferably MongoDB, Postgres and/or Time Series databases (InfluxDB);
  • Are fluent in English;
  • (Bonus) Have knowledge of statistics and machine learning frameworks such as TensorFlow and scikit-learn.

What we offer

A full-time position (32-40 hrs), money, pension, lunches, working from home, team activities, training budget – the usual. We work in a forward-thinking start-up culture with an energetic and engaged team, located around the corner from Utrecht Centraal. We’ll provide you with an NS-business card or cover your travel expenses to get there. We know how incredibly important it is to have the right tools. Any hardware or software you need to get your job done: great monitor, the best laptop, standing desk – you’ve got it.

About Sensorfact

Our mission is to reduce energy waste in industrial companies. We do this by making energy saving easy. To that end, we have developed a plug & play Energy Management System that consists of wireless sensors and a clear online platform. Our algorithms analyse the data and detect potential energy savings. This way we help our customers to reduce their energy bill by 5-10%.

Do you see yourself working at Sensorfact?

Apply now!

If you are an analytical talent and see yourself joining a high-growth, fast-paced scale-up, we would love to get in touch. If there are any questions, please feel free to reach out to us.

Apply

We will get back to you as soon as possible.

    Please upload both CV and motivation letter:
