• Södertälje



Apply by: 2021-11-30

30 credits – Efficient algorithm development for GPUs

Published 2021-10-04

Autonomous vehicle development at Scania is advancing at a very high pace, and self-driving trucks and buses on public roads will soon see the light of day. Autonomous Transport Solutions (ATS) Research at Scania is responsible for developing, testing and piloting future frontier ATS concepts. This work is done in agile, self-steering teams with the ambition to detect and evaluate upcoming technologies and prepare them for industrialization. We work in close cooperation with Volkswagen Group Innovation, leading technology suppliers and academic institutions.

This thesis work will be carried out under the supervision of the AI team, called EARA in Scania parlance. EARA develops deep learning-based methods used for scene perception.

You will work closely with the members of this highly competent, multicultural team, which is instrumental in developing cutting-edge autonomous technologies, and your ideas will be encouraged and embraced.

Today we must utilize general-purpose GPUs to achieve real-time behaviour for very data-intensive and computation-heavy algorithms. Algorithms on a GPU are normally part of a data flow involving deep neural networks as well as so-called kernels.

In real-time systems, most of the data flow is implemented by programmers using imperative programming languages, e.g., C and C++. The main reason is that the underlying computational model is packaged and distributed by, e.g., Nvidia, as C header files and libraries; these libraries export functions for working with a split architecture (host and device) and different kinds of memories. As a programmer, you have to be very knowledgeable about how this architecture works in order to write highly efficient programs.

We all know that imperative programming with side effects puts quite a burden on the programmer to write an efficient program without introducing too many bugs. For instance, using the Nvidia computational model, you are exposed to raw memory pointers of different types (host and device), and you need to handle concurrency explicitly: on the host using threads, and on the GPU using streams. Wouldn't it be nice if one could use an abstracted computational model where these details were hidden, but the performance were still as good?

Consider a tracking algorithm in image space, where the input is object detections from an object detector and the output is tracks. A baseline algorithm is to compute the intersection over union (IoU) of object detections and tracks, and then assign detections to the tracks with the best matches. This is an instance of the linear assignment problem, solved by, e.g., the Hungarian algorithm. It is not hard to implement such an algorithm in C or C++, but then it will be strictly CPU-bound unless some parts are explicitly moved to the GPU.

The problem formulation of this thesis is as follows:

  • Which computational models, programming languages, or programming languages with libraries, can describe a tracking algorithm and yield a highly efficient executable that utilizes available GPU resources?
  • Implement a tracking algorithm (for instance the baseline algorithm above), measure run-time and resource utilization, and compare the results with Scania’s tracking algorithms.
  • Also reason about which computational model gives the best tracking algorithm from a software engineering perspective, i.e., maintainability, extensibility, readability, etc.

One programming language that will most likely be part of this thesis is Futhark; you are encouraged to check it out.

The successful applicant will have the opportunity to gain hands-on experience with the latest sensors, computing platforms, and Scania’s concept autonomous vehicles. The applicant will also have access to the knowledgeable researchers and developers at Scania’s Autonomous Transport Solutions Pre-Development & Research department.

We are open to the exploration of innovative ideas and, if feasible, the applicant might also get a chance to submit her/his results to a reputable research conference or even file a patent application.

We are looking for one thesis worker studying a master's program in Computer Science. Applicants are expected to have a good understanding of computer vision, GPUs, and programming language theory. The applicant should be able to work in a diverse environment and communicate effectively in English. The personal traits of being agile, giving and receiving constructive feedback, and taking initiative will come in handy.

Time plan
The project is planned for 20 weeks and can be started any time in early Spring 2022.
Applicants will be assessed on a continuous basis until the position is filled.

Communication of the results
The results will be described in a report, published on Scania’s internal web and by the applicant’s university, and presented at Scania. The prototype tool will be made available to Scania.

The project will be performed within Scania’s Autonomous Transport Solutions Pre-Development & Research department.

Thomas Gustafsson, PhD, Expert Engineer, AI Technologies, Autonomous Transport Solutions,