What’s Your City’s Score?

I am coming up to the end of my Lambda School journey. The last leg is a “labs” project where students in the web, mobile, and data science tracks build apps as a part of a cross-functional team simulating an authentic project. As a Data Science student, I joined just such a team of seven developers and our intrepid TL (thanks Sammy!) as our project steward.

The Problem & Customer

The primary role I played prior to Lambda School was as a software Product Manager. Sometimes this was in large companies and often in small companies and startups. For me, the first question to ask is what’s the problem we’re solving, and for whom are we delivering a solution. This real-world inspired lab project was no different.

Our customers/users are people researching a city location with the intent to move or visit. Their problem was how to easily size up a city and compare its “livability” across multiple geographic options.

A Possible Solution

I am a big fan of the concept of a business or lean canvas. A business canvas is typically a succinct document, ideally one page, that summarizes the key elements of a business or product. (Check out Ash Maurya’s Lean Stack offering… highly recommended!)

Those key elements include sections for brainstorming & refining ideas regarding customers, problems, solutions, and our value proposition among other business drivers.

For me, identifying customers and problems is always a “chicken or egg” scenario. Those concepts need to be thought of together, not in isolation. That is, customers have problems, and those problems aren’t meaningful unless describable people live them.

At a high level, a solution would enable people to find cities they are interested in and dig into details that are central to understanding the city’s livability. A key to the solution will be to size up a city quickly without confusing the user with a wave of data or details. In other words, we want the user to be able to clearly see the forest rather than peer through a universe of trees. And finally, enable the user to compare that city’s forest (its comprehensible livability assessment) with other cities.

The Critical (Data) Path

After getting a grip on who our customer is, the problem they have, and our take on how we think they want to solve that problem, the next step was finding the data we’d need to fuel the solution.

We had varying success finding data associated with these domains of city life: rent, crime, “walkability” (can I get around without transportation), and air quality.

On a lean budget (well… $0 on top of our Lambda School tools) we found some raw data in these dimensions. That data typically described a point in time (e.g. YR2019) and was static in nature. That probably eliminated sophisticated predictive modeling — so a solid but not a great set of data sources.

Given our reasonable data expectations, was the game up? To answer that question, we harkened back to our first principles: what’s the problem we’re solving, and for whom are we solving it? I think our prospective users want some good insight and guidance. If rent or crime diverged a bit from what a whiz-bang predictive model would produce, I don’t think it would matter. So we pushed onward!

What’s the Score?

We surmised our users want ease, brevity, and a frame of reference to compare multiple cities. With reasonable, not great data sources, we landed on the trusty 1–5 scoring spectrum to present the city’s livability.

Ah yes, that 1 to 5 range was not invented by a surveying monkey in the late ’90s but has been a staple of customer, user, student, parent, and voter surveys for what seems like forever. Boring… maybe. Effective… yes!

Armed with this tried and true approach, we thought we could counter our concerns about less than stellar data and pedestrian data analysis. The data strategy was to calculate data “scores” that could translate to an easy-to-understand score, from 5 (highest) to 1 (lowest). We accomplished that by leveraging Facebook’s Prophet library for automatic forecasting (rent) and scikit-learn’s MinMaxScaler to rescale raw metrics, such as crime, onto a common range across locations.
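
To make the scaling idea concrete, here is a minimal sketch of min-max scaling onto a 1 to 5 range. It is not our project code: the function name `scale_to_score`, the `invert` flag, and the crime figures are all made up for illustration, and it uses plain Python rather than scikit-learn so it runs anywhere. It also assumes that a lower crime rate should earn a higher score.

```python
def scale_to_score(values, low=1.0, high=5.0, invert=False):
    """Min-max scale raw values onto [low, high].

    invert=True flips the scale so the smallest raw value gets the
    highest score (useful for crime, where less is better).
    """
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    scores = []
    for v in values:
        # Fraction of the way from min to max; 0.5 if all values are equal.
        frac = (v - v_min) / span if span else 0.5
        if invert:
            frac = 1.0 - frac
        scores.append(round(low + frac * (high - low), 2))
    return scores


# Hypothetical crime rates per 100,000 residents (illustrative, not real data).
crime_rates = {"City A": 200.0, "City B": 600.0, "City C": 1000.0}
crime_scores = dict(
    zip(crime_rates, scale_to_score(list(crime_rates.values()), invert=True))
)
print(crime_scores)  # {'City A': 5.0, 'City B': 3.0, 'City C': 1.0}
```

The safest city in the set lands on 5, the least safe on 1, and everything else falls proportionally in between, which is what makes scores comparable across cities.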

The interim result translated to scores by city. As an added benefit, we could ask users to weight each city dimension based on their personal preference from 1 (lowest importance) to 10 (highest importance). For example, the user could weight crime as very important while deemphasizing air quality. I think this allowed us to wring even more value out of our data while presenting results that seemed to be more tailored to our users’ needs.

Don’t Get Siloed

As we progressed down the data/coding path we resolved to leverage the user’s livability weightings with respect to crime, rent, air quality, and walkability. Calculating a weighted score across those domains is straightforward math.
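
As a rough illustration of that math, the sketch below computes a weighted average: each dimension’s 1 to 5 score is multiplied by the user’s 1 to 10 weight, and the sum is divided by the total weight. The function name `composite_score` and the sample scores and weights are hypothetical, and a weighted average is only one reasonable way to combine the scores.

```python
def composite_score(scores, weights):
    """Weighted average of per-dimension scores (1-5) using user weights (1-10)."""
    total_weight = sum(weights[d] for d in scores)
    if total_weight == 0:
        raise ValueError("at least one weight must be positive")
    weighted_sum = sum(scores[d] * weights[d] for d in scores)
    return round(weighted_sum / total_weight, 2)


# Hypothetical scores for one city and one user's preferences.
scores = {"crime": 4.0, "rent": 2.0, "air_quality": 3.0, "walkability": 5.0}
weights = {"crime": 10, "rent": 5, "air_quality": 1, "walkability": 4}
print(composite_score(scores, weights))  # 3.65
```

Because the result is divided by the total weight, the overall score stays on the same 1 to 5 scale as the individual scores no matter how the user sets the weights.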

Up until this point, those individual scores were generated in each API route handler. So each handler (GET /crime_scr, GET /rent_rate, etc.) included explicit Python statements to generate its relevant score.

Singular Route Generating a “Crime” Score

However, we had another dilemma. A composite, weighted score would need to first calculate each underlying score prior to producing the overall score.

Composite Score Code vs. Individual Scores Code

Addressing this dilemma led to a common architecture or code structuring question: how to reuse logic in multiple places in our app. To execute code that was already in place across multiple API route handlers, we could merely duplicate the code (coding “copypasta”?). While this could be effective, it runs counter to DRY (Don’t Repeat Yourself) programming principles.

The other option was to extract common code into discrete functions since calculating livability scores could be leveraged in multiple locations in our codebase. This seemed like a better choice as it would structure the logic for future enhancements and extensions.

My teammate and I dove back into our individual route handlers and pulled out the score calculation code into single-purpose functions. Then we updated our handlers to invoke those functions where needed.
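
The shape of that refactor can be sketched roughly as follows. This is a framework-agnostic illustration, not our actual handlers: the handlers are plain functions, and every name here (`crime_score`, `rent_score`, `get_crime_score`, `get_overall_score`, the lookup tables, and the numbers in them) is hypothetical. Only two dimensions are shown to keep it short.

```python
# Hypothetical pre-computed scores; a real app would read these from its data store.
CRIME_SCORES = {"Example City": 3.0}
RENT_SCORES = {"Example City": 1.5}


def crime_score(city):
    """Single-purpose function: return the 1-5 crime score for a city."""
    return CRIME_SCORES[city]


def rent_score(city):
    """Single-purpose function: return the 1-5 rent score for a city."""
    return RENT_SCORES[city]


def get_crime_score(city):
    """Individual route handler: a thin wrapper around one function call."""
    return {"city": city, "crime_score": crime_score(city)}


def get_overall_score(city, crime_weight, rent_weight):
    """Composite handler: reuses the same functions instead of copying their logic."""
    weighted = crime_score(city) * crime_weight + rent_score(city) * rent_weight
    overall = round(weighted / (crime_weight + rent_weight), 2)
    return {"city": city, "overall_score": overall}


print(get_crime_score("Example City"))
print(get_overall_score("Example City", crime_weight=8, rent_weight=2))
```

The point is that the scoring logic lives in one place, so a change to how crime is scored shows up in both the individual route and the composite route.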

This turned long, “heavy” handlers into leaner, focused API methods while also making the called functions more easily understood. See the code below for a concise handler that leverages a single function call to generate the primary output.

“Leaner” Code Using a Primary Function Call

The App

As data science team members, our part of the app is abstracted from the end-user and implemented as an API layer to be consumed by the upstream web and mobile app functionality. In short, the API functionality surfaced web service routes that returned livability scores for these dimensions of city life: crime, rent, walkability, and air quality. For our MVP release, as a team, we closed on supporting the top 100 cities by population. The final MVP end-user-oriented route returned a composite or overall city score taking into account the user’s preferred livability weighting.

Returning an Overall City Score for New York City

The API also surfaced supporting routes delivering data to the front-end for transactional purposes. These include routes to return the list of supported cities, city populations, and database connection status information.

The App & Beyond

The primary objective of the project was to engage as a team integrating multiple programming and operating disciplines simulating real-world projects. To that end, we focused on a limited MVP. The project may continue with a new Labs team who could be charged with extending the app. Potential enhancements may include expanding the list of supported cities (ideally nationwide), including additional livability dimensions, map-based searching, and side-by-side city comparisons.

The next team will face their own set of project challenges. With these potential extensions, finding suitable data sources will always be a task to tackle. That will include selecting the appropriate modeling methods: generating scores or metrics, building predictive models, and producing user-friendly visualizations.

The End Game

I think the project met our expectations. The team worked well together in a very relaxed, supportive manner even when we experienced the loss of a couple of teammates due to other commitments. As a Product Manager, I think my colleagues who were new to development projects got a very real team experience. This included building a shared vision, communicating issues, documenting resolutions, and integrating our code using common tools such as GitHub, AWS Elastic Beanstalk, and major development frameworks. While the project roadmap was familiar to me, it was a great education to live it from a developer’s mindset as opposed to that of a Product or Program Manager.

With that, I come to the end of the path both for this Lambda Labs project and my Lambda School journey. Both represent steps in what I think (and hope) is a much longer trek of learning and growth.

For me as a Product Manager, the lab experience illustrated many of the elements I observed technical teams address in past lives — defining business goals, understanding requirements, applying design patterns, and refactoring code.

As for Lambda School, an ancient proverb says that “a journey of a thousand miles begins with a single step” (wiki). I think I just took my first step and am looking forward to seeing what comes next.

Product development mensch. I dig Go, Angular, MongoDB, and Lean Startup. Studying Data Science at Lambda School (DSPT4)
