Manuscript on analyzing Approximate SGD

Bin Hu and I have recently posted a manuscript on arXiv entitled “Analysis of Approximate Stochastic Gradient using quadratic constraints and sequential semidefinite programs”. Approximate Stochastic Gradient refers to the standard SGD algorithm, widely used for solving optimization problems, with the twist that the available gradient information is noisy.

In this paper, we investigate the case where the noise is additive, multiplicative, or both. We also consider several different assumptions on the objective function; for example, the case where each component function is smooth (not necessarily convex) but the sum of the component functions is strongly convex. More generally, our approach allows one to specify different constraints on the individual functions versus their sum and obtain worst-case (and average-case) performance guarantees. Each case reduces to checking the feasibility of a small semidefinite program, which can be done numerically or analytically.
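
To give a flavor of the setup (the notation here is illustrative rather than the paper’s exact formulation): the algorithm updates with an inexact gradient g_k whose deviation from the true component gradient satisfies a quadratic bound such as

\[
x_{k+1} = x_k - \alpha_k\, g_k,
\qquad
\| g_k - \nabla f_{i_k}(x_k) \|^2 \le c^2 + \delta^2\, \| \nabla f_{i_k}(x_k) \|^2,
\]

where c models additive noise and \delta models multiplicative (relative) noise; setting c = 0 or \delta = 0 recovers the purely multiplicative or purely additive cases. Quadratic bounds of this form are exactly the sort of constraints that feed into the semidefinite programs.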

Tuning the stepsize presents interesting challenges, because one generally has to choose between converging rapidly to a large ball and converging slowly to a smaller ball. We investigate what happens when using a constant stepsize, and we also look at the optimal case, i.e., tuning the stepsize at each iteration by solving a Bellman equation. This approach gives insight into the intricate trade-offs that couple stepsize selection, convergence rate, optimization accuracy, and robustness to gradient inaccuracy.
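
As a toy illustration of this trade-off, here is a minimal sketch of my own (a one-dimensional quadratic with additive gradient noise; an illustration only, not the analysis from the paper):

    import numpy as np

    rng = np.random.default_rng(0)

    def noisy_sgd_ball(alpha, n_iters=5000, sigma=1.0):
        """Run SGD on f(x) = x^2/2 (true gradient is x) with additive
        Gaussian gradient noise, and return the average error over the
        final half of the iterates (the size of the ball they settle into)."""
        x, tail = 10.0, []
        for k in range(n_iters):
            g = x + sigma * rng.standard_normal()  # noisy gradient of f
            x -= alpha * g
            if k >= n_iters // 2:
                tail.append(abs(x))
        return np.mean(tail)

    for alpha in (0.5, 0.05, 0.005):
        print(f"stepsize {alpha:5.3f} -> ball radius ~ {noisy_sgd_ball(alpha):.4f}")

A large stepsize contracts the initial error quickly but settles into a wide noise ball; a small stepsize shrinks the ball but takes far longer to get there.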

CMO-BIRS workshop: Beyond Convexity

I attended a week-long workshop titled “Beyond Convexity: Emerging Challenges in Data Science”, hosted by the Casa Matemática Oaxaca (CMO) in Oaxaca, Mexico.

The workshop consisted of talks, breakout sessions, and many discussions on topics including semidefinite programming, nonlinear/nonconvex optimization, deep learning, and statistics. Much time was spent brainstorming about unsolved problems and discussing emerging topics in data science. The combination of a beautiful, secluded venue and a small group (roughly 30 attendees) led to many thought-provoking discussions. I returned to Madison with new knowledge, new ideas, and new colleagues. Couldn’t ask for more!

As part of the workshop, I gave a 30-minute talk where I presented recent work by Bin Hu and myself on using dissipativity theory to analyze and interpret the convergence properties of optimization algorithms. A video of my talk is available here and my slides are available here.

I’m grateful for the hard work put in by the organizers: Tamara Kolda (Sandia National Labs), and my colleagues at UW-Madison: Rob Nowak, Becca Willett, and Stephen Wright. Bravo! The photo above is a panorama taken at Monte Albán, one of Oaxaca’s most famous archaeological sites.

Allerton’17

I attended the 55th annual Allerton Conference on Communication, Control, and Computing in Monticello, IL. On the right is a photo of the view from the conference venue. You wouldn’t guess that this is in Illinois! At the conference, I presented an invited paper coauthored with my student Akhil Sundararajan, entitled “Robust convergence analysis of distributed optimization algorithms”. The paper describes a methodology based on semidefinite programming that allows one to efficiently analyze a variety of recently proposed distributed optimization algorithms (in the sense of bounding their worst-case convergence rates). The benefit of our method lies in its versatility: convergence analyses typically require a customized approach for each algorithm, whereas our method is flexible and can be applied broadly and automatically. We present two methods: one for obtaining graph-dependent performance bounds, and one for obtaining robust bounds that hold over the set of all graphs with a given spectral gap. Slides from my talk are available here.
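
In case the term “spectral gap” is unfamiliar: it quantifies how well connected the network is, i.e., how quickly local averaging mixes information across agents. Here is a quick toy snippet of my own (not code from the paper) that computes it for the standard mixing matrix of a ring network, under one common definition (one minus the second-largest singular value):

    import numpy as np

    def spectral_gap(W):
        """Spectral gap of a doubly stochastic mixing matrix W: one minus
        the second-largest singular value (the largest is 1, corresponding
        to the consensus direction)."""
        sigma = np.linalg.svd(W, compute_uv=False)  # sorted in descending order
        return 1.0 - sigma[1]

    # Mixing matrix for a ring of n agents, each averaging with its two neighbors.
    n = 6
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = 0.5
        W[i, (i - 1) % n] = 0.25
        W[i, (i + 1) % n] = 0.25

    print(spectral_gap(W))  # 0.25 for this ring; larger rings have smaller gaps

Smaller gaps mean slower mixing of information, and correspondingly weaker worst-case convergence bounds.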

ICML’17 in Sydney, Australia

I attended the 2017 International Conference on Machine Learning (ICML) in Sydney, Australia. ICML is one of the flagship machine learning conferences. Since machine learning is a very applied field, there was a considerable industry presence, including all the big-name tech giants: Google (Brain, DeepMind), Microsoft, Facebook, Amazon, NVIDIA, etc. Machine learning isn’t my area of expertise, but it was nonetheless interesting to step out of my comfort zone and hear about the state of the art in this very popular field. As one might expect, deep learning and reinforcement learning were prominently featured, and there were some very impressive demonstrations. It looks like there is still much work to be done on the theory side of deep learning, however; the next five years should be quite exciting!

At ICML, I presented joint work with Bin Hu on using energy-based methods (dissipativity theory) to understand and interpret why and how iterative algorithms converge. The key idea is that iterative algorithms are dynamical systems, so if we define an appropriate notion of internal energy, we can write a conservation law (as we do in physics). The rate at which internal energy dissipates is then equivalent to the rate at which the algorithm converges to its fixed point. If you’re interested, you can download my slides, my poster, or the paper itself.
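
Concretely, the heart of the argument is a dissipation inequality (paraphrased here; see the paper for the precise setup). If V is a storage function measuring the internal energy of the algorithm state \xi_k, and S is a supply rate driven by the inputs w_k (the gradient evaluations), then we seek

\[
V(\xi_{k+1}) \le \rho^2\, V(\xi_k) + S(\xi_k, w_k).
\]

Whenever the supply rate is nonpositive along trajectories, the energy decays geometrically, V(\xi_k) \le \rho^{2k}\, V(\xi_0), which certifies linear convergence to the fixed point at rate \rho.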

This was my first time visiting Australia and my first time in the Southern hemisphere too! The photo above is a panorama taken from a ferry looking back at Darling Harbour. In the photo, you can see the iconic Sydney Tower Eye, the Sydney Opera House, and the Sydney Harbour Bridge.

Focus period at Lund University

I attended the LCCC Focus Period on Large-Scale and Distributed Optimization at Lund University in Sweden during June 2017. The focus period consisted of a three-week stay and a three-day workshop on the topic of large-scale and distributed optimization. Focus periods are a unique opportunity for researchers to come together, collaborate, and exchange ideas without the time constraints of a standard conference.

I did a one-year postdoc at Lund University right after I completed my Ph.D., so it was nice to return to Lund and see all the familiar faces. Lund is also a beautiful place during the summer! I had a wonderful and productive time: I made new friends, forged new collaborations, and caught up with many of my former colleagues. Since the focus period was hosted by the Department of Automatic Control, it was a unique opportunity for controls and optimization researchers to come together.

During the workshop, I presented my ongoing work on using robust control tools to analyze and synthesize optimization algorithms. The abstracts and slides for all the talks during the workshop (including mine) are available here.

SIAM Conference on Optimization in Vancouver

I attended the SIAM Conference on Optimization in Vancouver, Canada. This was a memorable conference for several reasons. First, this conference only happens once every three years. It’s the flagship conference for optimization research, and it was also my first time attending a SIAM conference! Finally, as a Canadian, it is always nice to visit my home country. I also have family living in Vancouver, so that was a bonus!

At the conference, I presented my work on using robust control for algorithm analysis and optimization. This talk was part of an invited session titled “Robustness and dynamics in optimization”, organized by Ben Recht and Pablo Parrilo. It was a reprise of our successful session at ICCOPT 2016. If you’re interested in seeing my slides, please refer to my ICCOPT slides as the talks were quite similar.

Above, I included a picture of a Nanaimo Bar from the cafe across the street from the conference venue. This is a uniquely Canadian (and delicious) dessert that is very hard to find in the US. It owes its origin to the city of Nanaimo, British Columbia, which is on Vancouver Island (a short ferry ride away from the city of Vancouver).

ACNTW workshop 2017 at Northwestern

I attended the 2017 ACNTW workshop on optimization and machine learning, hosted by the Center for Optimization and Statistical Learning at Northwestern University.

This workshop brought together optimization researchers from the Midwest for a day of talks on a variety of topics including matrix completion, nonconvex approaches, and machine learning. Although most of the talks were given by academics, there were also a couple of talks and posters from researchers at Argonne National Laboratory. There was also a healthy representation of optimization researchers from UW-Madison! It was an excellent workshop. Looking forward to the next one!

This was my first time visiting downtown Chicago and, perhaps fittingly, Google took one of my photos, automatically stylized it, and suggested that this might be a nice photo to keep as a reminder of my trip! Machine learning at work! (see photo above)

Smart Urban Infrastructures Workshop at MIT

I attended the Smart Urban Infrastructures Workshop presented by LIDS at MIT. The idea was to bring together researchers and leaders from academia and industry for a series of short talks and panel discussions. The topics covered several types of “urban infrastructures”, including:

  • Ride-sharing platforms: optimization, control, scheduling, and management (car sharing, bike sharing).
  • Autonomous vehicles: challenges in robotics, coordination, safety, and accountability.
  • Privacy and security in the age of the “internet of things”.
  • Power grid: integrating uncertain renewable generation.

It was nice to have a mix of voices from both academia and industry in the panel discussions. As an academic, one can easily become isolated from the “real world”, so I appreciated the diversity of perspectives. For example, we heard Nicholas Chamandy (head of data science at Lyft) discuss the intricacies of the large-scale optimization problems faced by a data-driven ride-sharing company, and Andrew Therriault (chief data officer of the City of Boston) discuss the challenges of managing the smart infrastructure of a major US city.

Looking forward to attending more workshops of this sort in the future! Pictured above is the Stata Center at MIT, one of the coolest-looking buildings I’ve ever seen!

Akhil wins Teaching Excellence Award

Congratulations to my student Akhil Sundararajan for winning the 2017 Gerald Holdridge Teaching Excellence Award! This annual award recognizes top teaching assistants in Electrical and Computer Engineering at UW-Madison.

Akhil received the award at the ECE department’s annual Spring celebration. This year’s celebration coincided with the ECE department’s 125th anniversary, so Bucky Badger made an appearance to present awards! Pictured on the right is Akhil receiving the teaching award from Bucky. Congrats Akhil!

6th Midwest Workshop on Control and Game Theory at Michigan

I recently attended the 6th Midwest Workshop on Control and Game Theory hosted by the University of Michigan. The two-day workshop brought together faculty, postdocs, and grad students in the fields of controls, optimization, game theory, and economics. This was my first time visiting Ann Arbor; it’s truly a beautiful city (see photo on the right!).

My talk was about using tools from robust control to analyze the performance of iterative algorithms. It’s one of my favorite topics to talk about and I’m grateful for the opportunity to present my work to such a diverse crowd! My slides are available for download here.

Congratulations to Vijay Subramanian, Dimitra Panagou, Necmiye Ozay, and the other organizers for a stellar workshop. I’m looking forward to the 7th edition of this workshop next year!