These guidelines are subject to change.

Deadline

The final submission of your hackathon efforts is due at 3pm on Sunday, May 23, 2020. We will provide a Google Form for you to fill out. This deadline will be enforced strictly; we will not accept submissions after it. You may submit multiple times, in which case only your last submission will be evaluated.

What to Submit?

We expect two types of submission, and every team must submit both.

  1. Presentation: We require an approximately 5-minute video presentation of the results of your efforts (hard cut-off at 10 minutes). The presentation should include the team name and member(s), the motivation for the work, the main technical content of your contribution, and a summary of the results you obtained. You can use this opportunity to give a demo, show charts, or highlight your submission in any other way you choose. The easiest way to record such a video is to get the whole team on a Zoom call and record the meeting to a file. We may make this video available publicly. You will have the option of submitting a video file or a YouTube link.

  2. Source code: We also require submission of the complete source code of the work (including notebooks, etc.). We are not restricting the language or the libraries, and we are not going to execute the code at our end. At least some light documentation of your code is recommended; it's good practice to document as you go, and undocumented code is of little value to anyone. Also, maintain a list of references to external code and snippets that you are using in your project (attributions.txt in the root folder of your code archive). We are not going to share this code publicly without your consent, but if you are concerned, please include a LICENSE file with the code. You will have the option of submitting an archive (zip or tar.gz, etc.) or a link to a public GitHub repository.
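
The exact format of attributions.txt is up to you; as a rough illustration (the file paths and sources below are hypothetical), it might look like:

    # attributions.txt -- external code and snippets used in this project
    src/data.py (loading loop)    -- adapted from https://github.com/some-user/some-repo
    src/plots.py (heatmap helper) -- based on a public Stack Overflow answer
    notebooks/eda.ipynb           -- uses snippets from the pandas documentation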

Evaluation Criteria

We will evaluate each submission to the hackathon on the basis of the following criteria (in no particular order):

  1. Relevance of the Problem: For the dataset that you have chosen, how interesting is the problem that you are focusing on? In other words, even if you solve what you are aiming for perfectly, how useful would it be? The best way to ensure relevance is to get to know your domain very well and to work with the dataset owners to clarify their needs.

  2. Scientific Insight: An important part of building models for scientific data is providing insight and understanding, not just black-box predictions. Can you provide examples of insights gained from your model? What does it tell you (and the domain expert) about the underlying phenomena? Does your model support new hypotheses? Can you determine which variables are more relevant than others for prediction, and how do these insights line up with known theory in your domain? Can you visualize your results in a meaningful way (PCA, t-SNE, etc.) that helps the domain expert understand the model? (See the first sketch after this list.)

  3. Evaluation: Evaluating a model is much more than just providing an accuracy or RMSE number in a table. For example, including good baselines for comparison is essential: how would a simple model do on this problem? Reporting multiple metrics can also provide insight into model performance; for classification problems, metrics like precision/recall and calibration are often more useful than accuracy alone. Diagnostics and error analysis can provide significant insight as well: can you characterize what types of inputs cause your model to make errors? What are the weaknesses of your model? Do all models make errors on the same types of examples? (See the second sketch after this list.)

  4. Overall Creativity: This is an overall measure of aspects of the project that we found interesting or surprising. This applies not only to your use of a unique machine learning technique; it could also be a unique problem formulation, visualization of the data, evaluation metric, or use of existing tools.

  5. Technical Strength: How technically sound and involved is your work, in terms of the sophistication of the machine learning approaches used or the overall performance of the approach? It is important to reiterate that this is only one of the criteria, so try not to focus solely on it.

  6. Presentation: How good is your presentation video at communicating the problem setup, the motivation, the solution, and the results of your work? We will focus more on the informativeness and clarity of the presentation than on aesthetics or flashiness. The best way to meet this criterion is not to leave the presentation to the last day; instead, work on it a little at a time as your results come together.
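
For criterion 2, the following is a minimal sketch of the kind of 2D visualization mentioned above, using scikit-learn and matplotlib. The synthetic data from make_classification is just a stand-in for your own feature matrix and labels:

    # Minimal sketch: project a feature matrix to 2D and color by label.
    # The synthetic data below stands in for your own X and y.
    import matplotlib.pyplot as plt
    from sklearn.datasets import make_classification
    from sklearn.decomposition import PCA
    from sklearn.manifold import TSNE

    def plot_embedding(X, y, method="pca"):
        """Project X to 2D with PCA or t-SNE and color points by label y."""
        if method == "pca":
            coords = PCA(n_components=2).fit_transform(X)
        else:
            coords = TSNE(n_components=2, init="pca", random_state=0).fit_transform(X)
        plt.scatter(coords[:, 0], coords[:, 1], c=y, s=10)
        plt.title("2D " + method.upper() + " projection")
        plt.show()

    X, y = make_classification(n_samples=300, n_features=20, random_state=0)
    plot_embedding(X, y, method="pca")

Coloring the projection by a known label (or a measured quantity) is what makes the plot useful to a domain expert: clusters or gradients in the 2D view suggest which structure the model is picking up on.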
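
Similarly for criterion 3, here is a minimal sketch of reporting a trivial baseline alongside multiple metrics with scikit-learn; the logistic regression and synthetic data are placeholders for your own model and dataset:

    # Minimal sketch: compare against a trivial baseline and report
    # several metrics instead of a single accuracy number.
    from sklearn.datasets import make_classification
    from sklearn.dummy import DummyClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import (accuracy_score, precision_score,
                                 recall_score, brier_score_loss)
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=500, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
    model = LogisticRegression().fit(X_train, y_train)
    y_pred = model.predict(X_test)
    y_prob = model.predict_proba(X_test)[:, 1]  # positive-class probability

    print("baseline accuracy:", baseline.score(X_test, y_test))
    print("model accuracy:   ", accuracy_score(y_test, y_pred))
    print("precision:        ", precision_score(y_test, y_pred))
    print("recall:           ", recall_score(y_test, y_pred))
    print("calibration (Brier score):", brier_score_loss(y_test, y_prob))

A natural next step for error analysis is to inspect the test examples where y_pred disagrees with y_test and look for characteristics they have in common.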

The projects will be judged by a group of machine learning researchers and faculty, along with the dataset owners, who are domain experts in the field.