Internship project in Weak Supervision

The Wholesale Banking Advanced Analytics (WBAA) team is a large team of data scientists, data engineers, software developers and many more, focused on bringing data, machine learning and statistical modeling into the products that we build for our clients and internal users.

The data scientists in WBAA also have a strong desire to keep up with, and be part of, the latest developments in AI, tooling and statistics. They do so by working closely with master's students on a variety of topics to solve academic yet practical problems.

What will you do?

Supervised machine learning relies on large quantities of labelled data, which are often unavailable in practice. Manual annotation is time-consuming and costly, especially when domain experts are required.

The Data Programming paradigm combines multiple noisy labels to learn probabilistic labels that can be used to supervise machine learning models. Instead of manually labeling each example, the annotator provides labeling functions (LFs) that noisily label the data, for instance using domain heuristics or external knowledge bases.
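The idea of labeling functions can be sketched in a few lines of Python. This is a minimal illustration, not the Snorkel API: the transaction fields, thresholds and function names are all hypothetical, and the noisy votes are combined with a plain majority vote, which is a simple baseline rather than the generative label model that Data Programming actually learns.

```python
import numpy as np

ABSTAIN, LEGIT, FRAUD = -1, 0, 1

# Hypothetical labeling functions for a toy fraud-detection task.
# Each one encodes a noisy heuristic and may abstain.
def lf_large_amount(tx):
    return FRAUD if tx["amount"] > 10_000 else ABSTAIN

def lf_known_counterparty(tx):
    return LEGIT if tx["known_counterparty"] else ABSTAIN

def lf_night_time(tx):
    return FRAUD if tx["hour"] < 5 else ABSTAIN

LFS = [lf_large_amount, lf_known_counterparty, lf_night_time]

def apply_lfs(transactions):
    """Build the label matrix: one row per example, one column per LF."""
    return np.array([[lf(tx) for lf in LFS] for tx in transactions])

def majority_vote_proba(L, n_classes=2):
    """Turn LF votes into probabilistic labels by normalised vote counts.

    Rows where every LF abstains fall back to a uniform distribution.
    """
    probs = np.full((L.shape[0], n_classes), 1.0 / n_classes)
    for i, row in enumerate(L):
        votes = row[row != ABSTAIN]
        if votes.size:
            counts = np.bincount(votes, minlength=n_classes)
            probs[i] = counts / counts.sum()
    return probs

# Made-up example data, for illustration only.
transactions = [
    {"amount": 25_000, "known_counterparty": False, "hour": 3},
    {"amount": 120, "known_counterparty": True, "hour": 14},
    {"amount": 15_000, "known_counterparty": True, "hour": 11},
    {"amount": 50, "known_counterparty": False, "hour": 12},
]

L = apply_lfs(transactions)
soft_labels = majority_vote_proba(L)  # columns: P(legit), P(fraud)
```

The rows of `soft_labels` could then serve as training targets for a downstream classifier. The third transaction receives conflicting votes and the fourth receives none, so both end up with an uninformative 50/50 label under this baseline.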

Data Programming, through the popular Snorkel framework, has been widely used in industry (see snorkel.org). However, the Snorkel approach depends on two unknowns, the dependency structure of the weak labels and the class balance, that matter and are not trivial to estimate, especially for imbalanced problems such as fraud detection or when highly correlated LFs are provided. The recently proposed End-to-End Weak Supervision method claims to be robust in both situations, but more research is needed to understand how well it can be applied in practice. During your thesis work you will experiment with, validate and improve on these methods.
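A small sketch can show why class balance and LF correlation matter. It is not Snorkel's label model: it is a naive Bayes aggregation that assumes binary labels, LFs that are conditionally independent given the true class, known symmetric LF accuracies, and abstentions that carry no class information. The function name and all numbers are hypothetical.

```python
import numpy as np

ABSTAIN = -1

def naive_bayes_posterior(votes, accuracies, prior_fraud):
    """P(fraud | votes) under the independence assumptions stated above.

    votes:       iterable of 0 (legit), 1 (fraud) or ABSTAIN, one per LF
    accuracies:  assumed probability that each LF is right when it votes
    prior_fraud: assumed class balance, P(fraud)
    """
    log_odds = np.log(prior_fraud / (1.0 - prior_fraud))
    for vote, acc in zip(votes, accuracies):
        if vote == ABSTAIN:
            continue  # abstentions are treated as uninformative
        llr = np.log(acc / (1.0 - acc))  # log-likelihood ratio of one vote
        log_odds += llr if vote == 1 else -llr
    return 1.0 / (1.0 + np.exp(-log_odds))

# One LF with assumed accuracy 0.8 votes "fraud".
p_balanced = naive_bayes_posterior([1], [0.8], prior_fraud=0.5)
p_imbalanced = naive_bayes_posterior([1], [0.8], prior_fraud=0.01)

# The same LF supplied twice (perfectly correlated copies) is wrongly
# counted as two independent pieces of evidence.
p_duplicated = naive_bayes_posterior([1, 1], [0.8, 0.8], prior_fraud=0.5)
```

With a balanced prior the single vote yields a posterior of 0.8, while with a 1% fraud prior the same vote yields only about 0.04. That gap is why a misspecified class balance matters for imbalanced problems. Duplicating the LF pushes the balanced posterior to about 0.94, although no new information was added. That is the overconfidence an unmodelled dependency structure can cause.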

Our team has extensive experience with weak supervision. A previous UvA thesis internship at ING led to the publication of a paper on the topic.

What do you get in return?

We offer a master's thesis project, compensation of 600 euros per month, close supervision, and a tight-knit community of data scientists to interact with and learn from.

Who is ING?

Making positive change happen by using our expertise, resources and position in society in the right way. That is how we build on a proud history. In 1881, ING started helping people save. We were the first bank to help small businesses get ahead. And that is where our passion lies: helping people get ahead with an accessible, positive attitude. Anywhere and anytime.

Questions about this vacancy?

Team Mediastages