• Big Data for Public Transport
  • 2019 – 2021
Big Data for Public Transport aims to develop a new technology able to process data collected from smart payment systems and combine it with other data sources in order to analyse travel behaviour, generate mobility indicators and assist public transport authorities and operators in the planning of public transport systems.

The project

BD4PT is a research project funded by the Spanish Ministry of Economic Affairs and Digital Transformation and the European Regional Development Fund (ERDF). The project aims to develop a new technology able to process data collected from smart payment systems and combine it with other data sources (vehicle positioning, land use information, anonymised mobile phone data, etc.) in order to analyse travel behaviour, generate mobility indicators and assist public transport authorities and operators in the planning and operation of public transport systems.

Context

The planning and management of transport systems require accurate, reliable and up-to-date information on travel demand. The proliferation of geolocated data from mobile devices has opened new ways of studying people's mobility and obtaining information on transport demand quickly and at a significantly lower cost than traditional methods. These data sources eliminate or mitigate many of the limitations of mobility surveys (high cost, long turnaround times, small samples that reduce the quality of the information, etc.).

One of the most interesting data sources for characterizing urban mobility is the smart payment systems used in public transport services. Each time a user swipes their card, a record is created with the time of the validation and the line and/or stop used. Properly analysing this data and fusing it with information on the public transport network, the services offered and the location of the vehicles yields relevant information, such as the number of journeys made per day, their origins and destinations, their times, the number of transfers, etc. This information allows public transport authorities to evaluate their policies and propose measures to improve the service. However, the exploitation of this data is still very limited because transport authorities often lack the capabilities needed to apply the big data techniques it requires.
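As an illustration of the kind of indicator mentioned above, the following minimal sketch counts journeys per day and transfers from a handful of card validation records. The record layout, the sample data and the 45-minute transfer window are assumptions made for this example, not part of BD4PT.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical validation records: (card id, tap time, line, boarding stop).
TAPS = [
    ("c1", "2020-03-02 08:00", "L1", "A"),
    ("c1", "2020-03-02 08:25", "L2", "B"),
    ("c1", "2020-03-02 17:30", "L2", "C"),
    ("c2", "2020-03-02 09:10", "L1", "A"),
]

# Assumed threshold: a tap on a different line within this window of the
# previous tap by the same card is treated as a transfer, not a new journey.
TRANSFER_WINDOW = timedelta(minutes=45)


def summarise(taps, window=TRANSFER_WINDOW):
    """Return (journeys per date, total number of transfers)."""
    by_card = defaultdict(list)
    for card, ts, line, stop in taps:
        when = datetime.strptime(ts, "%Y-%m-%d %H:%M")
        by_card[card].append((when, line, stop))

    journeys = defaultdict(int)
    transfers = 0
    for events in by_card.values():
        events.sort()  # chronological order for each card
        prev = None
        for when, line, stop in events:
            is_transfer = (
                prev is not None
                and when - prev[0] <= window
                and line != prev[1]
            )
            if is_transfer:
                transfers += 1
            else:
                journeys[when.date().isoformat()] += 1
            prev = (when, line, stop)
    return dict(journeys), transfers


journeys_per_day, transfer_count = summarise(TAPS)
print(journeys_per_day, transfer_count)
```

With the sample data, card c1 makes a morning journey with one transfer and a separate evening journey, and card c2 makes a single journey. A real system would need a more careful transfer rule, for example one based on the stops served and the vehicle positions.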

Goals

BD4PT aims to lay the foundations for a future commercial solution that makes it easier for public transport authorities and operators to exploit these data sources to continuously monitor public transport demand. To this end, the project proposes to investigate improvements to a set of methods and techniques described in the state of the art, and to implement and validate the resulting algorithms, both in the laboratory and in case studies carried out in collaboration with public transport authorities and operators.

The solution to be developed will analyze the records from smart payment systems in public transport and merge them with other relevant data sources, whether owned by public transport authorities (for example, transport network or land use data) or available by other means, to provide information on mobility in public transport and other statistics relevant to the planning and management of such systems.

From a scientific-technical point of view, the project addresses different aspects for which a fully satisfactory answer has not yet been found:

1. Determination of the sequence of journey stages in public transport: for those systems where the trip is validated only at the beginning of the journey, there is still room for improvement in the algorithms that infer the stop at which the journey ends and identify transfers between different lines or modes. The longitudinal analysis of the data and its fusion with other sources that provide contextual information about travel are central elements of a new approach to the problem;

2. Determination of the origin and destination of the trip: beyond the stop-to-stop matrix obtained by analyzing only the routes travelled in public transport, public transport authorities need to know the origin and destination of the trips associated with those routes. This requires other data sources that provide information on the presence and activity of the population in the areas of influence of the public transport network. The challenge lies in integrating new data sources that can provide a dynamic measure of the presence of the population in a given area, for example, anonymized records from mobile phone networks and/or geolocated data from apps;

3. Expansion of the sample to the total number of public transport users: although smart cards are widely used among public transport users, a percentage of users still use tickets or means of payment that do not allow trips to be traced. It is therefore necessary to define a sample expansion methodology that takes into account the available data on the sample universe.
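The first and third points can be illustrated with a minimal sketch. It uses the trip-chaining heuristic commonly described in the literature, which is not necessarily the method developed in BD4PT: the alighting stop of a stage is assumed to be the stop on the boarded line closest to the next boarding stop, and the last stage of the day is assumed to return towards the first boarding stop. The expansion step is the simplest possible one, a ratio between total boardings and smart card boardings. All stop names, coordinates, lines and counts are hypothetical.

```python
import math

# Hypothetical stop coordinates (x, y in km) and the stops served by each line.
STOPS = {"A": (0.0, 0.0), "B": (2.0, 0.0), "C": (4.0, 0.0), "D": (4.0, 3.0)}
LINES = {"L1": ["A", "B", "C"], "L2": ["C", "D"]}


def nearest_stop_on_line(line, target_stop, exclude):
    """Stop on `line` (other than `exclude`) closest to `target_stop`."""
    candidates = [s for s in LINES[line] if s != exclude]
    return min(candidates, key=lambda s: math.dist(STOPS[s], STOPS[target_stop]))


def infer_alightings(boardings):
    """Infer alighting stops for one card's day.

    `boardings` is a chronologically ordered list of (line, boarding stop).
    Returns a list of (line, boarding stop, inferred alighting stop).
    """
    if len(boardings) < 2:
        # With a single boarding there is no chain to follow.
        return [(line, stop, None) for line, stop in boardings]
    result = []
    for i, (line, stop) in enumerate(boardings):
        # Next boarding stop; the last stage wraps around to the first one.
        next_stop = boardings[(i + 1) % len(boardings)][1]
        alighting = nearest_stop_on_line(line, next_stop, exclude=stop)
        result.append((line, stop, alighting))
    return result


def expansion_factor(total_boardings, card_boardings):
    """Naive expansion: weight applied to each smart card trip on a line."""
    return total_boardings / card_boardings


DAY = [("L1", "A"), ("L2", "C"), ("L2", "D"), ("L1", "C")]
for stage in infer_alightings(DAY):
    print(stage)
print(expansion_factor(1200, 1000))
```

In practice the heuristic is usually combined with walking-distance limits and checks against vehicle positions, and the expansion factor would be computed per line, period and fare type rather than as a single ratio.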

BD4PT’s public deliverables

Funding body - BD4PT
BD4PT, with file number TSI-100903-2019-37, is a project financed by the Spanish Ministry of Economic Affairs and Digital Transformation, and the European Regional Development Fund (ERDF).