Machine Learning-Based Traffic Control System in Uganda

Sudi Murindanyi
5 min read · Jun 22, 2021


Introduction

The project investigated the problem of traffic congestion at the Wandegeya junction in Kampala, Uganda. At this junction, the existing traffic light control causes long delays, air pollution, energy waste, accidents, and many other problems. The Government of Uganda, through the Kampala Capital City Authority (KCCA), has tried to solve the problem with different technologies, such as radar, but these did not help much. The project studied the traffic signal durations based on data collected manually by counting cars and on data provided by KCCA. A machine learning-based model was developed to control the traffic light (the agent). Q-learning, a model-free reinforcement learning algorithm, was used: it learns the value of the agent's actions and drives a neural network that predicts better actions to take. The model was evaluated with the Simulation of Urban MObility (SUMO) traffic simulator in a vehicular network, and the simulation results showed the efficiency of this model in controlling traffic lights.

About the data used

Manual Count of the Cars

The data was collected using the manual traffic counting method, recording the volume of vehicles per hour, which required physical site visits. Several surveys were conducted, including an analysis of the existing signal operations and of the physical layout of the junction. The main objective of the manual count was to collect quantitative data through observation and recording over a given period of time. The focus was mainly on morning traffic (high traffic entering town), evening traffic (high traffic leaving town), daytime traffic (high, symmetric), and weekend traffic (low, symmetric). Data was collected in 15-minute intervals for 8 hours a day over 10 days; the 15-minute counts were then aggregated into hourly traffic volumes by summing four consecutive 15-minute intervals.

Based on these counts, different signal design options were drawn up, following the traffic engineering manual for traffic signal design, in order to minimize delays and unsafe maneuvers. The options included protected phase plans, split-phase plans, and the addition of exclusive lanes. The large vehicle volumes required split phasing on some legs, while the space available for separate protected phasing is limited, since an exclusive lane is always needed for through vehicles. Because Wandegeya junction has too little space to allow more lanes, split phasing was adopted as the best option for this junction, and the geometric layout was extracted from Google Earth.
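As an illustration of the aggregation step described above, here is a minimal pandas sketch; the timestamps, counts, and column name are made up for demonstration and are not the project's actual data.

```python
import pandas as pd

# Hypothetical 15-minute vehicle counts for one approach of the junction.
counts = pd.DataFrame(
    {"vehicles": [120, 135, 142, 128, 150, 160, 155, 149]},
    index=pd.date_range("2021-03-01 07:00", periods=8, freq="15min"),
)

# Sum four consecutive 15-minute intervals to obtain hourly traffic volumes.
hourly = counts.resample("1h").sum()
print(hourly)
```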

Automatic Count

The data was provided by KCCA from cameras, sensors, and radars; this method of collecting data is called an automatic count. The automatic count method provides a means of gathering large amounts of traffic data. Counts are usually taken over 24-hour periods and may extend for a week, a month, or a year. When the counts are recorded for each 24-hour period, the peak period can be identified.
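Once hourly volumes are available for a 24-hour period, the peak hour can be picked out directly. A minimal sketch, again with hypothetical numbers:

```python
import pandas as pd

# Hypothetical hourly vehicle counts over one 24-hour period.
hourly = pd.Series(
    [80, 60, 50, 55, 90, 210, 480, 620, 540, 400, 350, 330,
     340, 335, 360, 420, 560, 650, 590, 430, 300, 210, 150, 100],
    index=pd.date_range("2021-03-01 00:00", periods=24, freq="1h"),
)

# The hour with the highest volume marks the peak period.
peak_hour = hourly.idxmax()
print(f"Peak hour starts at {peak_hour:%H:%M} with {hourly.max()} vehicles")
```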

Machine Learning Model

The model was developed and trained using the Q-learning algorithm and neural networks. When training the neural network, the input is the vector representing the state, while the desired output is the updated Q-value Q(s, a), which now includes the maximum expected future reward thanks to the Q-value update equation, Q(s, a) = r + γ · max_a′ Q(s′, a′). By doing this, the next time the agent encounters the state s, or a similar one, the neural network is likely to output Q-values that already account for the best future situation.
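As a rough sketch of this training step, the following combines the Q-value target above with a small neural network. The network architecture, discount factor, and state/action sizes are assumptions for illustration, not the exact configuration used in the project.

```python
import numpy as np
import tensorflow as tf

STATE_SIZE = 80   # assumed size of the state vector
NUM_ACTIONS = 4   # assumed number of traffic light phase choices
GAMMA = 0.75      # assumed discount factor for future rewards

# A small fully connected network mapping a state to one Q-value per action.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(STATE_SIZE,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(NUM_ACTIONS, activation="linear"),
])
model.compile(optimizer="adam", loss="mse")

def train_on_batch(states, actions, rewards, next_states):
    """One Q-learning update on a replay batch:
    target Q(s, a) = r + gamma * max_a' Q(s', a')."""
    q_current = model.predict(states, verbose=0)
    q_next = model.predict(next_states, verbose=0)
    targets = q_current.copy()
    targets[np.arange(len(actions)), actions] = (
        rewards + GAMMA * np.amax(q_next, axis=1)
    )
    model.fit(states, targets, epochs=1, verbose=0)
```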

Finally, the model was tested in SUMO. In a SUMO simulation, 1 step is equal to 1 second. For this model, one episode consists of 5,400 steps, which translates to 1 hour and 30 minutes of simulation.
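A SUMO episode of this kind is typically driven through the TraCI Python API. A minimal sketch follows; the configuration file name and edge ID are placeholders, not the project's actual files.

```python
import traci

# Placeholder names: the actual SUMO config and edge IDs for the
# Wandegeya junction model are not shown in this article.
traci.start(["sumo", "-c", "wandegeya.sumocfg"])

for step in range(5400):       # 5400 steps = 1 hour 30 minutes of simulation
    traci.simulationStep()     # advance the simulation by 1 second
    if step % 30 == 0:
        # Example read-out: number of halting vehicles on an incoming edge.
        halted = traci.edge.getLastStepHaltingNumber("north_in")
        print(step, halted)

traci.close()
```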

Results

The agent was trained using a traffic micro-simulator. It was trained over multiple episodes, each consisting of a traffic scenario from which it gathers experience. One episode consists of 5,400 steps, which translates to 1 hour and 30 minutes of traffic simulation. With the use of experience replay, the training phase breaks the correlations in the observation sequence and reuses the agent's past experience. In the early stages of training, the agent does not know which actions are the most valuable. To overcome this problem, at the beginning of training the agent should discover the consequences of its actions without worrying about performance. Once the agent has solid knowledge about the outcomes of its actions in a significant variety of states, it should increase the frequency of exploitative actions in order to pick the most valuable ones and consequently increase the performance achieved in the task.
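This explore-then-exploit schedule is commonly implemented with an epsilon-greedy policy. Below is a minimal sketch assuming a linear decay of the exploration rate over training; the schedule and constants are illustrative assumptions, not the ones used in the project.

```python
import random
import numpy as np

TOTAL_EPISODES = 100  # assumed number of training episodes

def epsilon_for(episode):
    """Linearly decay exploration from 1.0 (fully random) to 0.0."""
    return 1.0 - episode / TOTAL_EPISODES

def choose_action(q_values, episode):
    """Epsilon-greedy: explore early in training, exploit later."""
    if random.random() < epsilon_for(episode):
        return random.randrange(len(q_values))   # explore: random action
    return int(np.argmax(q_values))              # exploit: best known action
```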

Conclusion

Data was collected using manual counts and automatic counts; the data was then analyzed and the traffic control system was designed using Q-learning and a neural network. The Q-learning agent was implemented in the context of traffic signal control in order to investigate the efficiency improvement while maintaining a significant degree of realism. The learning agent was designed with a state representation that identifies the positions of vehicles in the environment, an action set defined by traffic light configurations with a fixed duration, and two reward functions that capture, with different magnitudes, the difference in vehicle waiting times between actions. In particular, the elements of the agent were designed to be plausible to implement in a possible real-world application.

The learning approach applied for the agent's training is the Q-learning equation combined with a deep neural network. Q-learning is used to update the action values as the experience of the agent grows, while the neural network is employed for the prediction of Q-values and, therefore, the approximation of the state-action function. The traffic micro-simulator SUMO was used to replicate a 4-way intersection with multiple lanes and to reproduce various traffic scenarios with different traffic distributions. The reward was calculated from the simulated waiting time of vehicles, making the agent aware of the consequences of its actions in different situations. Results indicate that the proposed agent can adapt to several traffic situations and is able to outperform the static traffic light system in situations of low and medium traffic density.
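The waiting-time-based reward described above can be sketched as the decrease in total waiting time between consecutive actions. The TraCI calls and edge IDs below are illustrative assumptions, not the project's exact reward implementation.

```python
import traci

INCOMING_EDGES = ["north_in", "south_in", "east_in", "west_in"]  # placeholder IDs

def total_waiting_time():
    """Sum of the waiting times of vehicles on all incoming edges."""
    return sum(traci.edge.getWaitingTime(e) for e in INCOMING_EDGES)

# Reward: positive when the chosen action reduced the total waiting time.
previous_wait = total_waiting_time()
# ... agent applies a traffic light phase, simulation advances ...
current_wait = total_waiting_time()
reward = previous_wait - current_wait
```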

Demo

Follow me on Medium for more stories about my Machine Learning projects. Thank you.
