We are launching two data challenges to discover latent patterns in mobile data (Challenge1) and wind energy sources (Challenge2)!
Any undergraduate student can participate, alone or in a group of up to 3 students.
The winners of the two challenges will participate in the Final SCAVENGE Workshop & Exhibition event (more info soon).

SCAVENGE will provide the datasets to participants in CSV format.
Expected material to be delivered:

  1. A written report explaining the methodology, presenting the results and discussing them.
  2. The source code developed to analyse the data.

Submit your Project Proposal to the SCAVENGE Coordinator, including a short abstract (1 page) with:
– title,
– author(s),
– affiliation (University),
– expected results,
– methodology.

IMPORTANT DATES:
Project Proposal (abstract): 15/01/2019
Notification of Acceptance: 31/01/2019
Contest Start: 01/02/2019
Final Project Due: 31/05/2019
Winner Announcement: 15/06/2019

Challenge1: Mobile Data

We provide a dataset containing traces from the 4G control channel of 3 different cells. The data cover one week of observations of mobile users in both uplink (UL) and downlink (DL). Specifically, the dataset consists of 4 columns: Timestamp, Temporal User Identifier, Transmission direction (UL/DL) and Modulation and Coding Scheme (MCS) Index (from 1 to 29).
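
As a starting point, the four-column layout described above could be read and grouped per user roughly as follows. This is only a sketch: the header names (`timestamp`, `user_id`, `direction`, `mcs`), the column order, the timestamp format and the sample rows are all assumptions made for illustration, and the actual SCAVENGE file may differ.

```python
import csv
import io
from collections import defaultdict

# Synthetic sample in an ASSUMED layout: header names, column order and the
# timestamp format are hypothetical, not taken from the SCAVENGE dataset.
SAMPLE = """timestamp,user_id,direction,mcs
1548979200.000,4f2a,DL,12
1548979200.001,4f2a,UL,7
1548979200.002,9c01,DL,28
1548979200.003,4f2a,DL,14
"""


def load_traces(handle):
    """Group MCS samples by (user, direction), sorted by timestamp."""
    traces = defaultdict(list)
    for row in csv.DictReader(handle):
        mcs = int(row["mcs"])
        if not 1 <= mcs <= 29:
            # Skip rows outside the MCS range stated in the description.
            continue
        key = (row["user_id"], row["direction"])
        traces[key].append((float(row["timestamp"]), mcs))
    for samples in traces.values():
        samples.sort()  # tuples sort by timestamp first
    return dict(traces)


traces = load_traces(io.StringIO(SAMPLE))
for (user, direction), samples in sorted(traces.items()):
    print(user, direction, [mcs for _, mcs in samples])
```

To read a real file, the `io.StringIO(SAMPLE)` argument would be replaced by an open file handle. Keeping UL and DL in separate traces makes the later per-direction comparisons straightforward.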

The student should use state-of-the-art machine learning algorithms to process this data to:

  • Perform temporal comparisons and classify the users based on their observed MCS patterns;
  • Implement a temporal predictor for the MCS of the different user classes and provide a detailed analysis of its accuracy;
  • Analyze the channel quality experienced by users and try to infer their mobility patterns.
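
Before applying more advanced models to the tasks above, it can help to fix a reference point. The sketch below computes a few per-user summary features that a classifier could consume, and evaluates a naive persistence predictor (the next MCS equals the last observed one) with the mean absolute error. The function names and the MCS series are illustrative assumptions; this is a baseline to compare against, not the method expected by the challenge.

```python
from statistics import mean, pstdev


def summarise(series):
    """Simple per-user features for a classification or clustering step."""
    return {
        "mean": mean(series),
        "std": pstdev(series),
        "min": min(series),
        "max": max(series),
    }


def persistence_forecast(series):
    """One-step-ahead naive forecast: predict each value as the previous one."""
    if len(series) < 2:
        raise ValueError("need at least two samples")
    return list(series[:-1])  # predictions aligned with series[1:]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Synthetic MCS trace of one hypothetical user (not real data).
mcs = [12, 14, 14, 13, 17, 17, 16]

features = summarise(mcs)
predicted = persistence_forecast(mcs)
actual = mcs[1:]
mae = mean_absolute_error(actual, predicted)

print(features)
print(f"persistence MAE: {mae:.3f}")
```

Any learned predictor should beat this baseline on held-out data for its accuracy analysis to be meaningful.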

References:
[1] 3GPP TS 36.213 version 8.3.0 Release 8: Evolved Universal Terrestrial Radio Access (E-UTRA); Physical layer procedures
[2] Mohammad T. Kawser, Nafiz Imtiaz Bin Hamid, Md. Nayeemul Hasan, M. Shah Alam, and M. Musfiqur Rahman, Downlink SNR to CQI Mapping for Different Multiple Antenna Techniques in LTE, International Journal of Information and Electronics Engineering, Vol. 2, No. 5, September 2012

Challenge2: Wind Energy

We provide a dataset containing hourly wind energy capacity factor values for different European areas. The capacity factor is the unitless ratio of the actual electrical energy output over a given period of time to the maximum possible electrical energy output over that period. The dataset contains 30 years of hourly observations, starting from January 1st, 1986. The European geographical area is divided according to the second level of the Nomenclature of Territorial Units for Statistics (NUTS-2). The dataset is composed of 262968 rows (one for each hour) and 255 columns (one for each NUTS-2 area).
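
The capacity factor definition and the stated row count can be checked with a few lines of code. In this sketch the function name, the turbine figures and the assumed end date (30 full calendar years after the start) are illustrative assumptions, not part of the dataset.

```python
from datetime import datetime


def capacity_factor(energy_mwh, rated_power_mw, hours):
    """Actual energy output divided by the maximum possible output."""
    if rated_power_mw <= 0 or hours <= 0:
        raise ValueError("rated power and period must be positive")
    return energy_mwh / (rated_power_mw * hours)


# Hypothetical example: a 2 MW plant delivering 0.5 MWh in one hour.
cf = capacity_factor(energy_mwh=0.5, rated_power_mw=2.0, hours=1.0)
print(cf)  # 0.25

# Hourly samples from 1986-01-01 over 30 calendar years (assumed to end
# just before 2016-01-01), leap days included.
start = datetime(1986, 1, 1)
end = datetime(2016, 1, 1)
hours = int((end - start).total_seconds() // 3600)
print(hours)  # 262968
```

The computed number of hours matches the 262968 rows stated in the description, which is consistent with a series that has no gaps and includes leap days.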

The student should use this data to:

  • Perform spatial and temporal comparisons of the areas in terms of capacity factor, including:
    – geographic diversity (i.e., how areas differ spatially in terms of capacity factor)
    – capacity factor correlation (i.e., how areas are correlated)
  • Provide a clustering of the European geographical area in terms of wind energy capacity factor using a state-of-the-art clustering algorithm.
  • Implement a temporal predictor of the capacity factor (e.g., based on machine learning or other mathematical tools) and provide a detailed analysis of its accuracy.
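
For the correlation part of the analysis above, a possible first step is the pairwise Pearson correlation between the capacity factor series of different areas. The sketch below uses plain Python and three short synthetic series; the area labels `A`, `B` and `C` are placeholders, not real NUTS-2 codes, and the values are invented for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Synthetic capacity factor series for three hypothetical areas.
areas = {
    "A": [0.10, 0.20, 0.30, 0.40],
    "B": [0.20, 0.40, 0.60, 0.80],  # scaled copy of A
    "C": [0.40, 0.30, 0.20, 0.10],  # A in reverse order
}

names = sorted(areas)
corr = {(p, q): pearson(areas[p], areas[q]) for p in names for q in names}

for p in names:
    print(p, [round(corr[(p, q)], 3) for q in names])
```

A distance such as 1 - r derived from this matrix is one possible input to a clustering step. For the full dataset of 255 columns, a vectorised routine such as NumPy's `corrcoef` or pandas' `DataFrame.corr` would be far more practical than this pure-Python loop.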