Human migration patterns under different scenarios of sea level rise
with Bistra Dilkina and Juan Moreno-Cruz
- We couple sea level rise and human migration models to create a framework for studying the effects of sea level rise on human population distributions.
- We implement the framework using the radiation model of human migration and the Digital Coast SLR estimates.
- Our results show how the indirect effects of sea level rise (people living in areas that will experience large migrant influxes) can be much larger than the direct effects.
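As a rough illustration of the migration component, the sketch below computes flows with the standard form of the radiation model, T_ij = T_i * m_i * n_j / ((m_i + s_ij) * (m_i + n_j + s_ij)), where m_i and n_j are the origin and destination populations and s_ij is the population within distance r_ij of the origin, excluding both endpoints. The function name, the `commuter_fraction` parameter, and the toy inputs are assumptions made for this example; they are not taken from the project's implementation.

```python
def radiation_flows(populations, distances, commuter_fraction=0.1):
    """Return a matrix where flows[i][j] is the expected flow from i to j.

    populations: list of population counts, one per location.
    distances: square matrix of pairwise distances between locations.
    commuter_fraction: assumed share of each origin's population that moves.
    """
    n = len(populations)
    flows = [[0.0] * n for _ in range(n)]
    for i in range(n):
        m_i = populations[i]
        t_i = commuter_fraction * m_i  # total number of people leaving i
        for j in range(n):
            if i == j:
                continue
            n_j = populations[j]
            r_ij = distances[i][j]
            # s_ij: population within radius r_ij of i, excluding i and j
            s_ij = sum(
                populations[k]
                for k in range(n)
                if k != i and k != j and distances[i][k] <= r_ij
            )
            flows[i][j] = t_i * (m_i * n_j) / ((m_i + s_ij) * (m_i + n_j + s_ij))
    return flows


# Toy example: three locations on a line, one unit apart.
example_populations = [100, 50, 50]
example_distances = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
example_flows = radiation_flows(example_populations, example_distances)
```

In a coupled framework, the populations passed in would presumably be adjusted for the area lost under each sea level rise scenario before the flows are computed; that step is not shown here.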
A machine learning approach to modeling human migration
with Bistra Dilkina
- We show that machine learning models of human migration, trained on historical data, can outperform traditional physics-based models of human migration.
- Using socioeconomic data further improves the models.
- We test our models on migrations between counties in the US and migrations between countries globally.
Predicting and alleviating road flooding for climate mitigation
with Amrita Gupta and Bistra Dilkina
UN Data for Climate Action Challenge
- We created a computational framework to determine which roads in Senegal should be fortified against flooding in order to maximize accessibility under a fixed budget.
- We combined road network data, data about different flooding scenarios, and human mobility data to estimate how accessibility over the road network in Senegal would be affected under different flooding scenarios.
- We found that optimizing over different accessibility measures gives nonuniform improvements to mobility in different parts of the country, and we demonstrated the tradeoffs between the available budget for road network repairs and the overall benefit of the repairs.
Map of outgoing trips in Senegal provided by Orange
A deep learning approach for population estimation from satellite imagery
with Fred Hohman and Bistra Dilkina
Geospatial Humanities Workshop at ACM SIGSPATIAL 2017
- We train convolutional neural networks to estimate population counts from satellite imagery. More specifically, we train CNNs that take 1km^2 patches of Landsat 7 satellite images as input, then directly regress how many people live in the area covered by the input image.
- We train our models using disaggregated Census tract data for 2000, and test them on data from 2010.
- We validate our models both quantitatively, by aggregating their predictions to the county level and comparing to ground truth values, and qualitatively, through visualizations of confident predictions, large error locations, and activation maps.
The top 8 most confident prediction images from the test set for each class, all of which are correctly classified. The types of images that appear, from top (highways, few people) to bottom (buildings, many people), further indicate that our deep learning model is learning semantically relevant features from satellite imagery.
Machine learning approaches for estimating commercial building energy consumption
with Bistra Dilkina, Jeffrey Hubbs, Wenwen Zhang, Subhrajit Guhathakurta, Marilyn A. Brown, and Ram M. Pendyala
Applied Energy, Volume 208
- In this work, we build machine learning models for estimating commercial building energy consumption, based on data from the Commercial Buildings Energy Consumption Survey.
- We show that machine learning models that rely only on square footage, number of floors, principal building activity, number of heating degree days, and number of cooling degree days can achieve good results for predicting commercial building energy consumption.
- We validate our models on energy consumption data from New York City.
- We apply our models to the 20 county Atlanta Metropolitan area with data from the CoStar database, and show energy consumption estimates aggregated by traffic analysis zones.
Network optimization of food flows in the U.S.
with Arezoo Shirazi, Mengmeng Liu, and Bistra Dilkina
International Workshop on Big Data for Sustainable Development at IEEE BigData 2016
- In this project we used data from the Commodity Flow Survey and linear optimization models to determine how flows of food in the United States could be restructured more efficiently.
- We minimize the number of “food ton miles” in the network of flows of tons of food between states. We constrain the total amount of outgoing and incoming commodities for each state to stay the same, while rerouting the flows between more efficient locations.
- We formulate the problem as a multi-objective optimization problem to find solutions that will be efficient but resilient.
An approach to integrate inter-dependent simulations using HLA with applications to sustainable urban development
with Ajitesh Jain, Bistra Dilkina, and Richard Fujimoto
Winter Simulation Conference 2016
- We facilitate the coupling of separately developed models by automatically generating HLA time management code.
- Specifically, our algorithm uses the SysML sequence diagram that describes the data dependencies among interdependent simulations to generate time management code, which automatically sequences the execution of the models so that their data dependencies are met.
- We use our automated code generation routine to create a joint model consisting of UrbanSim, MATSim, and TransitSim (a simple public transportation simulation we developed).
Optimization with integrated transportation and land use models
with Bistra Dilkina
- The goal of this project is to see if it is possible to influence where people live in an urban environment by changing the transportation networks with the purpose of achieving sustainability goals.
- We have coupled the recently released version of UrbanSim with MATSim to create a modeling framework in which to study this problem.
- I created several tools to visualize geographic and road network data to test the models.
Triangle densest k-subgraph problem with integer linear programming
with Bistra Dilkina
- Finding the Triangle Densest Subgraph of size k is an NP-hard problem that is useful for finding quasi-cliques in a graph.
- We are investigating finding and approximating hard instances of this problem with an Integer Linear Programming approach and comparing the performance against greedy heuristic-based algorithms.
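To make the objective concrete, here is a minimal sketch of one possible greedy baseline next to an exhaustive search for very small graphs. The source does not say which greedy heuristics were compared, so the peeling rule used here (repeatedly drop the vertex that participates in the fewest triangles) is an assumption, as are all function names. The graph is a dictionary mapping each vertex to a set of neighbors, and vertex labels are assumed to be sortable.

```python
from itertools import combinations


def count_triangles(adj, nodes):
    """Count triangles in the subgraph induced by `nodes`."""
    total = 0
    for u, v, w in combinations(sorted(set(nodes)), 3):
        if v in adj[u] and w in adj[u] and w in adj[v]:
            total += 1
    return total


def triangles_at(adj, remaining, u):
    """Count triangles through u that lie entirely inside `remaining`."""
    neighbors = sorted(adj[u] & remaining)
    return sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])


def greedy_peel(adj, k):
    """Heuristic: remove the vertex in the fewest triangles until k remain."""
    remaining = set(adj)
    while len(remaining) > k:
        victim = min(sorted(remaining), key=lambda u: triangles_at(adj, remaining, u))
        remaining.remove(victim)
    return remaining


def brute_force(adj, k):
    """Exact answer by enumeration; only feasible for tiny graphs."""
    return max(combinations(sorted(adj), k), key=lambda s: count_triangles(adj, s))


# Toy graph: a 4-clique on {0, 1, 2, 3} with a path 3-4-5 attached.
example_graph = {
    0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3},
    3: {0, 1, 2, 4}, 4: {3, 5}, 5: {4},
}
example_subset = greedy_peel(example_graph, 4)
```

The greedy routine gives no optimality guarantee in general, which is the reason to compare it against an exact formulation on hard instances.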
Vertex cover solvers (class project)
CSE 6140 - Algorithms, Fall 2015
- I implemented Branch and Bound and Simulated Annealing algorithms in Python to solve the Vertex Cover problem.
- I tested these algorithms against algorithms that team members implemented, using common graphs from the networking literature.
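A branch and bound solver for Vertex Cover can be sketched as follows. This is not the class project's code: the branching rule (pick an uncovered edge and try each endpoint) and the lower bound (the size of a greedily built maximal matching) are common choices assumed here for illustration, and the function names are hypothetical. Edges are assumed to join two distinct vertices.

```python
def min_vertex_cover(edges):
    """Return a minimum vertex cover of the graph given as a list of edges."""
    # Taking every endpoint is always a valid cover, so it is the first bound.
    best = {"cover": {v for edge in edges for v in edge}}

    def matching_lower_bound(remaining):
        # Each edge of a matching needs its own cover vertex, so the size of
        # any matching is a lower bound on the cover of the remaining edges.
        used, size = set(), 0
        for u, v in remaining:
            if u not in used and v not in used:
                used.update((u, v))
                size += 1
        return size

    def branch(remaining, cover):
        if not remaining:
            if len(cover) < len(best["cover"]):
                best["cover"] = set(cover)
            return
        if len(cover) + matching_lower_bound(remaining) >= len(best["cover"]):
            return  # this branch cannot beat the best cover found so far
        u, v = remaining[0]
        for pick in (u, v):  # at least one endpoint must be in the cover
            rest = [edge for edge in remaining if pick not in edge]
            branch(rest, cover | {pick})

    branch(list(edges), set())
    return best["cover"]


# Toy example: a star with center 0 and three leaves.
example_cover = min_vertex_cover([(0, 1), (0, 2), (0, 3)])
```

A simulated annealing variant would trade this exactness for speed on larger graphs.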
Cellular automata networks for predicting weather
advised by Dawn Wilkins
- This was my undergraduate honors thesis project. I examined simulating climate variables with cellular automata models.
- I was interested in whether adding long-range connections to the cellular automata model could improve its accuracy by learning the influences that certain climate indicators (like El Nino) have on local weather.
- During the project, I automated the training of over 10,000 neural networks on the Mississippi Center for Supercomputing Research’s cluster.
Automating measurements of space plants
with Josh Vandenbrink
- This was my senior-year capstone project. I was tasked with creating a framework that automated the data collection process from images of seedlings grown on the International Space Station.
- A lab in the UM Department of Biology received groups of 80+ images, showing the growth of up to 10 seedlings per image over time, then had to measure each seedling in each image by hand with a graphics program.
- I created a Python program that facilitated faster manual measurements and automatically performed OCR, perspective transformations, and image registration on these image groups to standardize the measurements as much as possible.
Satire detection (class project)
CSCI 517 - NLP, Spring 2014
- The objective of this project was to make an algorithm that could detect whether a text was satirical or not.
- I scraped a corpus of articles from satirical news websites and regular news websites, then trained several common classifiers on parts of speech, n-grams, and bag of words features on the corpus.
- I found that the classifiers achieved high accuracy by overfitting to the high number of quotations used in satirical news articles.
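The kind of features described above can be sketched with the standard library alone. The tokenizer, the function names, and the `quote_rate` feature are assumptions made for illustration; the last one is included only to show how a single surface cue such as quotation frequency could be measured, and it counts straight double quotes only.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(text, n=2):
    """Count word n-grams; n=1 gives a bag-of-words representation."""
    tokens = tokenize(text)
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def quote_rate(text):
    """Straight double-quote characters per token (hypothetical feature)."""
    tokens = tokenize(text)
    return text.count('"') / max(len(tokens), 1)


example_bigrams = ngram_counts("the cat sat on the cat", n=2)
example_rate = quote_rate('He said "no"')
```

Comparing a feature like `quote_rate` across the two classes is one way to check whether a classifier is relying on a corpus artifact rather than on satire itself.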
Face keypoints (class project)
ENGR 691 - Machine Learning, Spring 2014
- This problem was from the “Facial Keypoints Detection” Kaggle competition: given an input image of a face, output the locations of certain facial keypoints.
- I attempted to train a classifier to label the output from automatic keypoint detection methods from the OpenCV library.
- I also attempted a direct regression-based approach using a neural network and local binary pattern images, which performed better.
Sparse local binary pattern histograms for face recognition with limited training samples
with Jianxia Xue
ACM Southeast Regional Conference 2014
- This project focused on the problems involved in automating classroom attendance using face recognition with a single training sample for each individual.
- I created a web application with a Python backend that performed online facial detection and recognition via HTML5 webcam access.
- We examined improving the standard Local Binary Pattern Histogram approach for face recognition and using active learning to improve recognition accuracy.
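For readers unfamiliar with the baseline, the sketch below computes the basic 3x3 Local Binary Pattern code and its histogram for a grayscale image stored as a list of lists. It shows only the standard operator, not the sparse variant studied in the paper, and the neighbor ordering and function names are conventions assumed for this example.

```python
def lbp_code(image, r, c):
    """Return the 8-bit LBP code of the interior pixel at row r, column c."""
    center = image[r][c]
    # Neighbors visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbor at least as bright as the center sets its bit.
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Histogram of LBP codes over all interior pixels (256 bins)."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


# Toy example: only the top-left neighbor is brighter than the center.
example_image = [[9, 1, 1],
                 [1, 5, 1],
                 [1, 1, 1]]
example_code = lbp_code(example_image, 1, 1)
```

In the usual histogram-based approach, a face image is divided into a grid of cells, one histogram is computed per cell, and the concatenated histograms are compared between a probe image and each enrolled sample.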