• Big Data for Public Transport
  • 2019 – 2021

The BD4PT project

BD4PT is a research project funded by the Spanish Ministry of Economic Affairs and Digital Transformation and the European Regional Development Fund (ERDF). The project aimed to develop a new technology capable of processing data collected from smart payment systems and combining it with other data sources (vehicle positioning, land use information, anonymised mobile phone data, etc.) in order to analyse travel behaviour, generate mobility indicators and assist public transport authorities and operators in the planning and operation of public transport systems.


The planning and management of transportation systems require accurate, reliable and up-to-date information on travel demand. The proliferation of geolocated data sources from mobile devices has opened up new ways of studying people's mobility and obtaining information on transport demand in a short period of time and at a significantly lower cost than traditional methods. These sources eliminate or mitigate many of the limitations of mobility surveys (high cost, long turnaround times, small samples that reduce the quality of the information, etc.).

One of the most interesting data sources for the characterisation of urban mobility comes from the smart payment systems used in public transport services. Each user leaves a record of the time at which they swiped the card and the line and/or stop used. Properly analysing this data and fusing it with information on the public transport network, the services supplied and the location of the vehicles makes it possible to obtain relevant information, such as the number of journeys made per day, their origin and destination, the times at which trips take place, the number of transfers, etc. This information allows public transport authorities to evaluate their policies and propose measures to improve the service. However, the exploitation of this data is often limited because transport authorities lack the capabilities needed to apply the required big data techniques.
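As an illustration of how validation records can be turned into indicators such as journeys per day and number of transfers, the following minimal Python sketch groups card taps into journeys. It is not the BD4PT implementation: the record layout (card_id, timestamp, line), the function names and the 60-minute transfer window are assumptions made for this example only.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical rule: a tap on a different line within this window of the
# previous tap counts as a transfer rather than as a new journey.
TRANSFER_WINDOW = timedelta(minutes=60)


def group_into_journeys(taps):
    """Group (card_id, timestamp, line) taps into journeys per card."""
    by_card = defaultdict(list)
    for card_id, ts, line in taps:
        by_card[card_id].append((ts, line))

    journeys = []
    for card_id, records in by_card.items():
        records.sort()  # chronological order for each card
        current = None
        for ts, line in records:
            is_transfer = (
                current is not None
                and ts - current["last_tap"] <= TRANSFER_WINDOW
                and line != current["lines"][-1]
            )
            if is_transfer:
                current["lines"].append(line)
                current["last_tap"] = ts
            else:
                current = {"card_id": card_id, "start": ts,
                           "last_tap": ts, "lines": [line]}
                journeys.append(current)
    return journeys


def daily_indicators(journeys):
    """Count journeys and transfers per calendar day."""
    per_day = defaultdict(lambda: {"journeys": 0, "transfers": 0})
    for journey in journeys:
        day = journey["start"].date()
        per_day[day]["journeys"] += 1
        per_day[day]["transfers"] += len(journey["lines"]) - 1
    return dict(per_day)
```

In practice the transfer rule would also need to take into account stop locations and service timetables, which is where the fusion with network and vehicle positioning data comes in.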


BD4PT aimed to lay the foundations for a commercial solution that would help public transport authorities and transport operators exploit these data sources to continuously monitor public transport demand. The project set out to improve a set of methods and techniques described in the state of the art, and to implement and validate the resulting algorithms both in the laboratory and in different case studies carried out in collaboration with public transport authorities and operators. The solution developed analysed the records from smart payment systems in public transport and merged them with other relevant data sources, whether owned by public transport authorities (for example, transport network or land use data) or available by other means, to provide information on public transport mobility and other relevant statistics for the planning and management of such systems.

From a scientific-technical point of view, the project addressed different challenges:

1. Determination of the sequence of journey stages in public transport: for those systems where the trip is only validated at the beginning of the journey, there is still room for improvement in the algorithms that infer the stop at which the journey ends and identify transfers between different lines or modes. The longitudinal analysis of the data and its fusion with other sources that provide contextual information about travel are central elements of a new approach to the problem.

2. Determination of the origin and destination of the trip: beyond the stop-to-stop matrix obtained by analysing only the routes travelled on public transport, public transport authorities need to know the origin and destination of the trips associated with those routes. For this, it is essential to use other data sources that provide information on the presence and activity of the population in the areas of influence of the public transport network. The challenge lies in integrating new data sources that can provide a dynamic measure of the presence of the population in a given area, for example, anonymised records from mobile phone networks and/or geolocated data from apps.

3. Expansion of the sample to the total number of public transport users: although smart cards are widely used by public transport users, there is still a percentage of users who travel with tickets or means of payment that do not allow their trips to be traced. Therefore, it is necessary to define a sample expansion methodology that takes into account the available data on the sample universe.
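To make the first and third challenges more concrete, the sketch below shows a simple form of trip chaining, an approach commonly used for entry-only validation systems, together with a naive expansion factor. It is a minimal illustration under stated assumptions, not the algorithms developed in BD4PT: the function names, the data layout and the 800 m walking threshold are all hypothetical. The alighting stop of each stage is assumed to be the downstream stop closest to the next boarding stop, and the last stage of the day is assumed to return towards the first boarding stop.

```python
import math


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000 * math.asin(math.sqrt(h))


def infer_alighting_stops(boardings, line_stops, stop_coords, max_walk_m=800):
    """Infer one alighting stop per boarding for one card and one day.

    boardings:   list of (line, boarding_stop) in chronological order
    line_stops:  dict line -> ordered list of stops in the travel direction
    stop_coords: dict stop -> (lat, lon)
    max_walk_m:  hypothetical maximum walk to the next boarding stop
    Returns a list with the inferred stop, or None when nothing plausible fits.
    """
    n = len(boardings)
    if n < 2:
        return [None] * n  # a single boarding gives no chain to exploit

    inferred = []
    for i, (line, board_stop) in enumerate(boardings):
        # Next boarding stop; the last stage wraps around to the first one.
        target = stop_coords[boardings[(i + 1) % n][1]]
        stops = line_stops[line]
        downstream = stops[stops.index(board_stop) + 1:]
        best = min(downstream,
                   key=lambda s: haversine_m(stop_coords[s], target),
                   default=None)
        if best is not None and haversine_m(stop_coords[best], target) > max_walk_m:
            best = None  # too far to walk: leave the stage unresolved
        inferred.append(best)
    return inferred


def expansion_factor(total_boardings, card_boardings):
    """Naive factor scaling card-based trips up to all counted boardings."""
    return total_boardings / card_boardings
```

A single global factor is the simplest possible expansion; a methodology such as the one called for above would typically compute factors per line, period or ticket type using the available data on the sample universe.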

BD4PT’s public deliverables


Conference talks

BD4PT, with file number TSI-100903-2019-37, is a project financed by the Spanish Ministry of Economic Affairs and Digital Transformation and the European Regional Development Fund (ERDF). "A way to make Europe".