Areas-of-Interest for OpenStreetMap (AOI for OSM) -- White Paper Draft v.01
===

Authors: Kang and Keller. Many thanks to Jerry *(and others like Joost, Simon, Ilya, Martin, tbc.)*.

This is an early draft of a white paper - and a collaborative 'HackMD' document - about Areas-of-Interest (AOI) for and from OpenStreetMap (OSM).

# Abstract

This is an abstract for a conference proposal (~2500 chars) with the title **"Areas-of-Interest for OpenStreetMap with Big Spatial Data Analytics"**.

This is a work-in-progress report about "Areas-of-Interest" (AOI) for OpenStreetMap (OSM) using big spatial data analytics. OSM is a free map of the world based on a collaborative, volunteered effort. OSM is a promising, yet underestimated alternative to Google Maps (GMaps). AOI were introduced in GMaps around mid-2016. They highlight places with the highest concentration of restaurants, shops and bars in a light orange style. In addition, it is suspected that traffic from human activities/tracks as well as ratings of Points-of-Interest (POI) are further factors in the algorithm. However, it remains opaque and non-reproducible how exactly AOIs are defined in GMaps. This is where OSM makes a difference, among others. We will explain and document the algorithm used in this work.

First of all, some design decisions had to be made: Our AOI concentrate on map visualizations at city and neighbourhood level and leave buildings and POI as they are at street level, as part of the base map. We neither classify nor personalise the AOI according to assumed information needs, like travelling, shopping or cultural entertainment, but respect all of them as POI. In this work an AOI is defined as an *"Urban area at city or neighbourhood level with a high concentration of POI, and typically located along a street of high spatial importance"*. The first step of the algorithm is straightforward; it selects relevant POI and leaves out irrelevant ones, e.g. commercial offices.
Then - as simple as the definition sounds - it implies quite some challenges: The first issue is about efficiency and the second one about relevance. The spatial aggregation algorithm for POI and the related building outlines needs to efficiently process massive data at worldwide level. As clustering algorithm we chose Density-Based Spatial Clustering of Applications with Noise (DBSCAN). Then, this spatial aggregation outline is merged and aligned with streets of high spatial importance. For the relevance calculation we used a network centrality algorithm. This centrality measure borrows from research on labelling hierarchies for street maps. From the results, our algorithm selects aggregates that exceed a heuristically found threshold and labels them as AOI. We will evaluate this algorithm against manually selected test areas. This contribution concludes with possible use cases, like tourist maps, and gives an outlook on further use cases, refinements and implementations of AOI.

# Introduction

The goal of OpenStreetMap (OSM) based Areas-of-Interest (AOI) is to support visitors of crowded, mostly urban areas, typically as part of a base map. Although inspired by Google ('G\*') Maps (GMaps) and its orange-yellow areal style, these AOIs are more than an alternative to GMaps. The resulting output dataset of the AOI for OSM project is - or will be - freely accessible, openly defined and reproducible.

By analysing and deriving new insights from OSM and other open data, AOI for OSM fits well with projects like 'OSM Landuse/landcover' ([osmlanduse.org](http://osmlanduse.org/)). Similar to osmlanduse.org, it could be worthwhile to implement a website which interactively shows the parameters the algorithm uses in order to determine whether an area becomes an AOI.

## Some design considerations

The AOI for OSM map style could appear between and including zoom levels 12 and 16. It can be seen as a spatial subdivision of residential areas which are clearly visible e.g.
in a medium-grey style in the default map of OSM (style 'Standard' (Carto) on osm.org).

It's important to distinguish between AOIs which are at building level (for instance the cluster near the Altes Zeughaus) and more areal AOI (e.g. the main shopping streets in Rapperswil). We're focussing on areal AOI, which of course rely on building-level AOIs.

A fundamental design decision is whether to differentiate AOIs according to a use case (or classification), and whether to calculate the AOI beforehand or in real time.

There are even retail areas (see e.g. [landuse=retail here](https://www.openstreetmap.org/way/364714195)) which can be directly used as AOI input. The issue is that there are many gaps to be filled in OSM.

Different classifications exist (e.g. shopping, cultural, entertainment; see also POI classes e.g. in map.search.ch), as do different transportation modes (alias routing profiles like car, public transport(?), bicycle, pedestrian).

One challenge is to identify a list of Points-of-Interest (POI), including tags to exclude (see appendices). Another challenge is the spatial clustering of points and polygons (see separate chapter), and a possible alignment to "commercial street corridors" (Annechino and Cheng 2011).

## A look at GMaps' AOI

Google did not disclose the implementation used for AOI in GMaps. There's only this short passage in a blog post:

"We determine 'areas of interest' with an algorithmic process that allows us to highlight the areas with the highest concentration of restaurants, bars and shops. In high-density areas like New York City, we use a human touch to make sure we’re showing the most active areas." -[Google Blog](https://blog.google/products/maps/discover-action-around-you-with-updated/)

From this citation, two terms stand out: "concentration" (of POIs) and "active" (user logs?).
At first glance it might seem that GMaps simply highlights buildings with restaurants, bars, shops, etc., then highlights in orange the areas that have such places nearby, and that the more concentrated these buildings are, the bigger the orange area becomes. However, an attempt to figure out how the AOI are selected shows that it is not quite as simple. What follows is a report on investigations in selected places in Switzerland, Great Britain, the USA (New York City, San Francisco) and Singapore.

Firstly, GMaps seems to select these areas 'by tag', namely by the kind of building. A quick search shows that Singapore's no-frills hotels/budget motels - which are rather infamous among locals - like Fragrance Hotel and Hotel 81 appear to be highlighted as AOI, even those in more obscure or shady parts of the city. Further searching into accommodation reveals that hotels, hostels, motels, BnBs, resorts, etc. are all tagged as AOIs. Conversely, residential apartments and houses, office buildings and nature parks seem to be excluded from AOIs.

There seems to be one exception to the highlighting of AOIs though: when GMaps does not have the building footprint of such a place, it cannot tag it as an AOI, even if it should be one according to GMaps' own algorithm (an example is recorded below).

Another interesting point is that even though certain places of interest essentially serve the same purpose, GMaps might highlight one as an AOI but not the other. This leads us to believe that the AOI-determining algorithm relies on more than just tags or the purported "concentration" or "active areas". For example, we can take a look at 3 standalone butcher shops in Switzerland, namely: Rickenbach Metzgerei GmbH, Lehmann's Hombi-Metzg Gmbh and Metzgerei Brönnimann.
Rickenbach is a standalone butcher shop, which is not an AOI on GMaps, located in the municipality of Galgenen, in the March District of the canton of Schwyz, Switzerland. According to GMaps, the only AOIs within a radius of approximately 200 m are a bar, a restaurant, an appliance store and a hairdresser. One might hypothesize that the population of the location is a factor, in other words, that the place is not as "active" (Galgenen population: ~4500, canton Schwyz population: ~155k). So, for contrast, let's take a look at another butcher in a more populated municipality and canton: Butchery Keller AG, located in Wiedikon (population ~45k) in the canton of Zürich (population ~1.5m), supposedly a "concentrated" and "active" city. Keller, like Rickenbach, is a standalone butcher which is also not an AOI on GMaps, although it is surrounded by a number of nearby locations tagged as AOIs. In addition, Keller has 19 Google reviews, whereas Rickenbach has none. In contrast, we have Metzgerei Brönnimann, located in the municipality of Rapperswil-Jona (population: ~26k) in the canton of Sankt Gallen (population ~500k). This is another standalone butcher shop, surrounded by fewer AOI than Keller, in a less populous city and canton than Keller's, with only 8 Google reviews, yet it is marked as an AOI on GMaps.

Another similar example would be water parks in Switzerland:

* Splash e Spa Tamaro has no building footprint and is not an AOI
* Alpamare has a building footprint but is not an AOI
* AquaParc has a building footprint and is an AOI

And also nail salons:

* No Limit Nails in Jona has a building footprint but is not an AOI
* Juup Nails, Singapore has a building footprint and is an AOI

It is clear that an AOI is more or less directly derived from a POI (boundary); therefore, at z17, where all building footprints are visible, it is easy to discern districts of importance.
However, we are more interested in areal AOI at z16, where the building footprints are not visible. More specifically, we observed that certain areas appear as grey buildings at z17 but, when aggregated with the nearby POIs, become a rectangular AOI district at z16. The converse is also true: certain buildings identified as POI at z17 are not classified or aggregated into an areal AOI at z16. We seek to find out why in the following.

The following observation depicts an interesting phenomenon: The [Petrini Shopping Plaza](https://www.google.ch/maps/@37.7762167,-122.4449187,16.75z?hl=en) in San Francisco is not seen as an areal AOI at z16.75, however, it is [at z17](https://www.google.ch/maps/@37.7761854,-122.4451177,17z?hl=en). What is puzzling is that nearby, certain buildings of the University of San Francisco (USF), including the dining/bookshop building and the gymnasium, are shown despite the fact that 1. the physical size of the highlighted USF buildings and of Petrini Shopping Plaza is comparable and 2. Petrini Shopping Plaza is surrounded by more nearby POIs, whereas the highlighted USF buildings are largely isolated from nearby POIs.

The converse can also be observed. An example is the road stretching from 899 Haight Street to 600 Haight Street in comparison with the nearby area encompassing 300-310 Broderick Street. [At z16](https://www.google.ch/maps/@37.7731279,-122.4349478,16.75z?hl=en), 600-899 Haight Street can clearly be identified as an AOI district; however, on further zoom, [at z17](https://www.google.ch/maps/@37.7733453,-122.4354492,17z?hl=en), it is clear that the POIs along the street are very sparse and that these buildings are comparatively small. On the other hand, in nearby Broderick Street, the building cluster that houses a number of eateries and groceries does not seem to be aggregated into the AOI street right beside it.
A more extreme example can be seen in the (also nearby) area South of Market Street. [At z16](https://www.google.ch/maps/@37.7740472,-122.4134038,16.75z?hl=en), it appears as an entirely open space of grey streets and buildings, but on zooming further into [z17](https://www.google.ch/maps/@37.7742951,-122.4133023,17z?hl=en), one can see that the area is actually dotted with POIs, including bars, nightclubs, accommodation and even interesting buildings like Uber HQ. A little towards the west, one can also find a myriad of cafes, restaurants and bars which are all highlighted as POIs but not as an areal AOI, except for the (comparatively) small space around [Smitten Ice Cream.](https://www.google.ch/maps/@37.7761514,-122.4203536,16.75z?hl=en)

These observations led to the evaluations noted below, in an attempt to reverse engineer GMaps' algorithms.

## A look at AVUXI

The Barcelona-based startup AVUXI offers "Heat Maps" and an API in a product called TopPlace™. Interestingly, they differentiate the following categories:

* Sightseeing
* Eating
* Shopping
* Nightlife

See https://www.avuxi.com/heat-maps-demo .

# Evaluation of Observations

## Discussion

It seems that GMaps is largely created using algorithms (["These building footprints, complete with height detail, are algorithmically created"](https://maps.googleblog.com/2012/10/expanded-coverage-of-building.html) on creating building footprints, ["We determine “areas of interest” with an algorithmic process"](https://blog.google/products/maps/discover-action-around-you-with-updated/) on their AOI determining process), as compared to OSM, whose data is entered by human users. The trade-off is that OSM is likely to have higher-quality maps - assuming that all user-contributed data is accurate and that there are contributors mapping even the most rural parts of the world.
The difference in the level of detail can be seen in the following comparisons ([cr: Reddit user parabol443 for the links](https://www.reddit.com/r/programming/comments/7kxkrt/google_mapss_moat/dri8jff/)):

* Museum of Pennsylvania: [Google](https://www.google.com/maps/@40.2655139,-76.8852932,17.54z) vs [OSM](https://www.openstreetmap.org/#map=18/40.26581/-76.88533)
* Mulholland Drive, LA: [Google](https://www.google.com/maps/@34.124951,-118.4039111,17z) vs [OSM](https://www.openstreetmap.org/#map=17/34.12360/-118.40251)
* Queens, NY: [Google](https://www.google.de/maps/@40.7236365,-73.7372316,17z) vs [OSM](https://www.openstreetmap.org/#map=18/40.72305/-73.73682)
* Pine Street, SF: [Google](https://www.google.com/maps/@37.7927273,-122.3979334,18z) vs [OSM](https://www.openstreetmap.org/#map=18/37.79302/-122.39800)

GMaps' algorithm, as of writing, is also not flawless. There are places where it has failed to come up with a building footprint while the same building is correctly mapped in OSM. An example is the missing Block 143 (among other nearby buildings, like the Leng Hup San Chee Chea Temple, which face the same problem) on [GMaps](https://www.google.ch/maps/place/143+Teck+Whye+Ln,+Singapore+680143/@1.3806038,103.7530876,20.5z/data=!4m5!3m4!1s0x31da11bf710be3b1:0xf77eb85deabf4873!8m2!3d1.380557!4d103.753062?hl=en), which can be seen on [OSM](https://www.openstreetmap.org/search?query=blk%20143%20teck%20whye%20lane#map=19/1.38060/103.75366).

The observations from the previous section raise the question of how GMaps determines which locations qualify as an AOI, as they imply that the selection of AOIs is based on more than just building densities (as mentioned in the blog) or the earlier hypothesized population of the town/city/area in which the building is located.
More data must be at play in their AOI algorithm - perhaps population density, size of location, nearby streets, affluence of location, user logs and ratings, user contributions (indoor floor plans), and more, all of which needs to be verified. The hypothesis is also backed up by others who have written about their findings: Laura Bliss from [citylab](https://www.citylab.com/design/2016/08/google-maps-areas-of-interest/493670/) poses the question "Point being: Could it be that income, ethnicity, and Internet access track with 'areas of interest'?". Findings from [searchenginewatch](https://searchenginewatch.com/2016/08/31/google-maps-update-decides-which-areas-are-of-interest-for-users/) are similar, reporting that within commercially busy areas, certain parts, notably those with a lower socioeconomic demographic, were not highlighted as AOIs.

What is of more interest to our project is that the findings from these articles support our theory that GMaps might be selecting AOI based on user tracks, e.g. from GMaps app users. More importantly, this leads to a decision fork for us at OSM: we need to decide more carefully which tags we want to include in our AOI dataset and which use cases we want to cover. As GMaps' dataset and data collection are proprietary, OSM, as a FOSS project, has to come up with a different approach to such an algorithm.

There have also been certain observations which raise a few points of interest worth looking into.
For example, in a large [retail park, Castle Marina, Nottingham](https://www.google.co.uk/maps/place/Currys+PC+World+featuring+Carphone+Warehouse/@52.9468365,-1.1602752,16.75z/data=!3m1!5s0x4879c22efe53513b:0x64dca853c03b8b2c!4m5!3m4!1s0x4879c224e7cfb1f5:0x905c1704a0de1ff3!8m2!3d52.9465179!4d-1.1585809) (UK), at z16 only one store is highlighted as an AOI - Currys PC World - which apparently happens to be the only store in the nearby cluster that has provided a store plan to Google. Another phenomenon visible from the same link is that while z16 shows Currys PC World as the only AOI, zooming in to z17 unveils a whole cluster of AOI that was not visible at z16, even though some of the buildings highlighted only at z17 are more relevant as AOI, or physically larger, than Currys PC World. It is worth noting, however, that at z16 these 'invisible' AOIs do not have their building footprints shown on the map, whereas Currys' is shown. Whether this reinforces the earlier assumption that a building can only be listed as an AOI if it has a building footprint still needs to be looked into.

Another observation is that at Nottingham Trent University (UK), certain university blocks are highlighted as AOI ([School of Architecture, Design and the Built Environment](https://www.google.ch/maps/@52.9571889,-1.1529814,17z?hl=en)) whereas others are not ([School of Art and Design at Waverley building](http://www4.ntu.ac.uk/map_files/City_2D.pdf)). Further investigation into why the Maudslay building was highlighted reveals that it actually has a [cosmetic shop in the building](https://www.google.ch/maps/@52.9574192,-1.1528538,19z?hl=en), whose place label only appears at z19.
Why the entire building was labeled as an AOI with just a cosmetic retail shop in it (if that is the reason the building was highlighted at all) is unclear, and it contrasts with the fact that only part of the [Nottingham University Hospitals NHS Trust building was highlighted](https://www.google.ch/maps/@52.9435301,-1.1855671,18z?hl=en) despite it being a single building. At which zoom levels the place labels appear is, once again, decided by Google's algorithm ([Source](https://support.google.com/business/answer/6056435)), and it might be linked to the zoom levels where building footprints are available as well. However, the fact that one zoom level can make a world of difference in the AOI being shown, as observed in Castle Marina Retail Park, shows that the approach leaves something to be desired.

## Compilation of observations

Focussing on areal AOIs - as opposed to identifying and colouring single buildings as AOIs - one can summarize the observations on the functionality of GMaps in the following hypotheses, each with presumed evidence (and weight/presumed threshold) as well as verification 'status'.

1. "AOIs are selected based on POI class": High evidence and high weight. Status: Verified.
1. "AOIs that are in areas with dense traffic get a higher chance of being selected" (see the following items; "traffic" in this context means people visiting the area, not to be confused with inhabitants as in a census): Some evidence. Status: Very hard to verify.
1. "AOIs are selected based on analysed user tracks (mostly recorded unconsciously) from mobile apps": Some evidence. Status: Very hard to verify.
1. "AOIs are selected based on user reviews and likes": Some evidence. Status: Very hard to verify.
1. "AOIs are selected based on other user engagement and/or endorsement like 'G\* My Business', submitted floor plans, etc.": Evidence unclear. Status: Unverified.
1. "AOIs are spatially clustered from POI points and buildings with some influence of street lines" (AOI buildings are aggregated and sometimes look like they are buffered symmetrically on both sides of the street, even when there are no AOIs on the other side): High evidence and weight. Status: Verified.
1. "AOIs are selected based on density of surrounding POIs (i.e. a certain amount of POIs must exist within a given radius)": Strong evidence and weight. Status: Verified.
1. "POI buildings have a certain weight, and when they hit a certain threshold, buildings along a street or road are aggregated into an areal AOI": Some evidence. Status: Very hard to verify.
1. "Some level of aggregation or extrapolation is performed on small-scale POIs to turn the entire building that houses them into an AOI": Strong evidence. Status: Largely verified.
1. "AOIs are aggregated from POIs and labeled as an entire stretch of street within a certain bounding box": Some evidence. Status: Hard to verify.

# Design Choices for an OSM based Approach

There are a few design choices that OSM can attempt to implement. Our challenges are deciding which use cases we want OSM to cover, how to make use of existing and other open datasets to implement an AOI function, and whether we can build on existing features.

A feature that already exists in OSM is routing, whereby a user is shown different routes between places depending on the mode of transport (car, bicycle, on foot, and possibly in the future, public transport). An idea that builds on this feature is routing for "tourists" or "nightlife", whereby users can contribute places they know to have high activity, or local recommendations of places to go.
A similar function is available on map.search.ch, where users can select what kind of amenities or locations they want visible on their maps. We could combine such a function with user-generated data to create stretches or areas of interest, highlighted with a colored overlay to show users that, for example, the streets along Clarke Quay, Singapore or Langstrasse, Zürich are go-to places brimming with nightlife.

## Overall approach

The main input is OpenStreetMap data (the "planet file") for the whole world, a country or a region. The output is a geographic dataset, e.g. a layer with multi-polygons (possibly in GeoJSON format).

An early proposal of steps might be:

1. Select relevant tags
2. Analyse user-contributed data
3. Analyse other crowd-sourced data (with adequate license)
4. Apply spatial clustering
5. Analyse results from spatial clustering
6. Filter and apply new results (possibly as overlay)

As we observed from GMaps, certain locations are most likely to be AOI-worthy whereas others we definitely want to exclude. To start off, we would map out the OSM tags that we want to include, as well as exceptions. From there, we can use other input data like log files of page views at certain zoom levels, trending places, high densities of Flickr photograph geocoordinates, lists of events/festivals and other openly available datasets to further sieve out areas of interest that we can present to our users. We can also implement gamification, as in Kort, to help automate certain parts of AOI generation. Last but not least, in keeping with the fundamentals of OSM, we could let users themselves contribute data, through inputs or by checking errata, to further refine our AOI features and improve their quality.
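
Step 1 of the proposal (selecting relevant tags and exceptions) could look roughly like the following minimal Python sketch. The tag sets are only a small, hypothetical excerpt of the list in the appendix, the ATM exclusion mirrors the exception mentioned there, and the element layout (dicts with a `tags` mapping) is an assumption for illustration, not a decided data model.

```python
# Hypothetical excerpt of the appendix tag list; not the final selection.
AOI_TAGS = {
    "shop": {"mall", "bakery", "butcher", "supermarket", "clothes", "books"},
    "amenity": {"pub", "bar", "cafe", "restaurant", "cinema", "nightclub"},
    "leisure": {"fitness_centre", "water_park"},
}
# Tags that exclude an element even if another tag matches (assumed example).
EXCLUDED_TAGS = {"amenity": {"atm"}}


def is_relevant_poi(tags):
    """Return True if the tags match the include list and no exception."""
    for key, values in EXCLUDED_TAGS.items():
        if tags.get(key) in values:
            return False
    return any(tags.get(key) in values for key, values in AOI_TAGS.items())


def select_pois(elements):
    """Keep only the elements whose tags qualify them as AOI input."""
    return [e for e in elements if is_relevant_poi(e.get("tags", {}))]


# Made-up sample elements for illustration.
sample = [
    {"id": 1, "tags": {"amenity": "restaurant"}},
    {"id": 2, "tags": {"office": "company"}},
    {"id": 3, "tags": {"amenity": "atm"}},
    {"id": 4, "tags": {"shop": "bakery", "name": "Example"}},
    {"id": 5},
]
print([e["id"] for e in select_pois(sample)])  # [1, 4]
```

Keeping the include and exclude lists as plain data would make it easy to publish and discuss them openly, which is one of the stated goals of the project.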
A possible piece of future work might be a sort of "playlist", or rather "sightseeing list", whereby users can add and edit such a list for every town/city/community, much like how Spotify and YouTube users can add songs to playlists for other users to see. Such lists would be community-based, like Wikipedia, rather than individualized, to ensure quality, prevent repetitive lists, and improve usability by only downloading nearby or preset areas. It is food for thought, as there are countless threads posted online asking where the interesting places in certain cities are, and even on frequently updated sites like TripAdvisor, information submitted by users might still be outdated. A community-based "sightseeing list" that is updated regularly in real time might therefore be of value to our user base.

Once we have established our building AOIs or active POIs, we can apply spatial clustering with a predetermined or localized threshold to pick out clusters with a high density of building AOIs. From there, we can analyse the results and pick out streets or areas which are AOIs, and perhaps cross-check against other sources which places are active. We can then aggregate such streets or districts and highlight them as a sort of overlay on the existing map.

## Spatial analysis

The algorithm "DBSCAN density-based spatial clustering" looks promising for spatial clustering of points (see references). The question is whether it is enough to estimate parameters like minPts and minDistance/epsilon globally - or if local adaptation is needed. Then a "Network Centrality" algorithm detects the more 'important' nodes in an urban street network (see references).

At the time of writing, it appears that a certain amount of localization is required to make DBSCAN results meaningful to local users.
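
To make the roles of minPts and minDistance/epsilon concrete, here is a minimal, dependency-free Python sketch of DBSCAN over (lat, lon) points using great-circle distances. It is a brute-force illustration and not the implementation planned for this project; the function names, the example coordinates and the parameter values are assumptions chosen for the example.

```python
import math

EARTH_RADIUS_M = 6371000.0
NOISE = -1


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def dbscan(points, eps_m, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE.

    Brute-force O(n^2) neighbour search: fine for a sketch, far too slow
    for a planet file, where a spatial index would be needed.
    """
    labels = [None] * len(points)

    def neighbours(i):
        # Includes point i itself, as in the usual DBSCAN definition.
        return [j for j, q in enumerate(points)
                if haversine_m(points[i], q) <= eps_m]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point, not expanded further
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = neighbours(j)
            if len(j_neighbours) >= min_pts:  # core point: grow the cluster
                queue.extend(j_neighbours)
    return labels


# Two made-up groups of POIs about 1 km apart, plus one isolated point.
pois = [
    (47.2266, 8.8184), (47.2267, 8.8184), (47.2266, 8.8185),
    (47.2366, 8.8184), (47.2367, 8.8184), (47.2366, 8.8185),
    (47.3000, 8.9000),
]
labels = dbscan(pois, eps_m=50, min_pts=3)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

In this form, local adaptation would simply mean calling `dbscan` with different `eps_m` and `min_pts` values per region.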
A densely populated area would require a smaller minDistance and a higher minPts than a sparsely populated area, which would require a larger minDistance and a lower minPts. The specific values for these parameters are currently unknown but will be looked into. The rationale is that our implementation of AOI starts with identifying POI, which means that most commercial service establishments (which are observed to be one of the most common types of building) would be tagged as a POI. As such, we need to use DBSCAN to pick out clusters of such POIs to be considered in our AOI system.

An example would be [near the Union Square](https://www.google.ch/maps/@37.787492,-122.4059143,16.75z?hl=en) in San Francisco. At z16, you can see that more than half of the map is tagged as an AOI. On further zoom at z17, one can observe that almost every building in that area is tagged as a POI. As San Francisco is a very densely populated area, especially the business district aka Financial District, there needs to be a differentiation between the busy streets and roads and the POIs. With adjustments to DBSCAN's minPts and minDistance, we can seek to sieve out the "more active" AOI in an already active area.

The Network Centrality algorithm can then be applied to help aggregate scores on "nodes", using a scoring rubric based on a mixture of its four basic measures of centrality, grouped here into three items:

1. prestige (as it takes into account how "credible" a certain area is by means of what others say, i.e. through online searches, reviews, etc., maybe?),
2. closeness and betweenness (as, generally and anecdotally speaking, the closer and better connected a POI is to a main street/public transport, the more accessible and more visited it would be),
3. a specialized degree algorithm, whereby a cluster's degree is calculated with respect to POIs or public amenities (like transport stops) as, technically, the more POIs a cluster is connected to, the more "active" the district should be.

Afterwards, when a cluster of POIs reaches a certain threshold, a certain street or lane would be tagged as an AOI.

An overview of the suggested approach:

1. Using DBSCAN with localized parameters, identify clusters and noise
2. Treat each cluster we identified as a single node entity
3. Using the Network Centrality algorithm, assign each node a value
4. Identify nodes that surpass a threshold value and highlight the area with the same street name as AOI

It is not apparent which algorithm GMaps uses to label its AOI, as Google certainly has a huge collection of information and data to work with; however, it seems that, at an initial level, the suggested approach might be able to roughly replicate GMaps' AOI feature. To implement this approach, we would definitely require more help, possibly in the form of local knowledge (for coming up with a score-assigning algorithm that makes the Network Centrality algorithm more meaningful to our cause). With that said, a good way to begin would be to apply the resources we currently possess (see references) to a familiar area, and then refine our way to other parts of the globe.

A great place to start would be OSMnx, a new tool that helps with street network analysis (link under references). OSMnx has features that download and build topologically corrected street maps (which may also be useful if we look at implementing AOI at z16 vs POI at z17, and may help with code efficiency and map requests, as OSM works with slippy maps).
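
Steps 2 to 4 of the overview (clusters as nodes, a centrality value per node, a threshold) can be illustrated with a small, dependency-free Python sketch. It uses plain closeness centrality on hop counts as a stand-in for the scoring rubric, which is not yet defined; prestige, betweenness and degree are not covered. The graph, the node names and the threshold are hypothetical. A real implementation would more likely build the graph with a tool like OSMnx rather than by hand.

```python
from collections import deque


def closeness_centrality(graph):
    """Closeness of each node in an undirected, unweighted graph.

    `graph` maps a node to the set of its neighbours. Distances are hop
    counts found by breadth-first search; unreachable nodes are ignored.
    """
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


def select_aoi_nodes(graph, threshold):
    """Return the nodes whose closeness reaches the (heuristic) threshold."""
    scores = closeness_centrality(graph)
    return sorted(node for node, score in scores.items() if score >= threshold)


# Hypothetical cluster nodes strung along one street: A - B - C - D - E.
street = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
print(select_aoi_nodes(street, threshold=0.5))  # ['B', 'C', 'D']
```

On this toy street the middle clusters score highest, which matches the intuition that well-connected stretches are more likely to be "active".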
As OSM map data includes points along curves of a street as nodes (due to the way road segments are mapped in OSM), OSMnx can correct the OSM data into a topologically correct, mathematical graph. In doing so, it removes unnecessary nodes, making it easier to work with OSM data. OSMnx also provides several visualization features that help identify street segments by length, one-way vs two-way streets, etc. Street network analysis can then be done and, with the aid of a weighting system using the Network Centrality algorithm as a model, we can achieve an efficient AOI tagging system. [This link](https://github.com/gboeing/osmnx-examples/blob/master/notebooks/08-example-line-graph.ipynb) shows how OSMnx can be utilized to display network centrality on a visual network graph.

## Potential input data

This is a compilation of potential input data:

1. AOIs are selected based on POI class: see Appendix.
1. POIs can have a popularity weight determined by certain factors, for example: waiting time, volume of people (across different time periods), population density against time period (if nearby buildings are all offices with peak periods around noon, evening, etc., a location may be popular/AOI-worthy compared to, say, a cluster of eateries off a quiet alley). Of course, the difficult part is obtaining such data for us to weigh.
1. Page views on osm.org give evidence of activity and contribute medium weight to being selected as AOIs (see ['Trending Places'](http://geometalab.github.io/Trending-Places-in-OpenStreetMap/))...
1. As we are looking at AOI, we need to identify places with high traces of human activity (see [sightsmap](http://www.sightsmap.com/) and its [wiki page](https://en.wikipedia.org/wiki/Sightsmap)).
1. Misc. proposals of user-generated data (license checked?):
    1. Using [Strava map](https://labs.strava.com/heatmap/#15.09/8.82008/47.22521/hot/all) data to find highly frequented streets (an API is available, but rate limiting exists).
    1. Using Twitter text analytics to find places much talked about. We could create a hashtag exclusively for OSM AOI use or crawl for related hashtags (general ones like #nightlife, #disco or more app-specific ones like #tripadvisor or #foursquare; whether public hashtags of companies can be used under fair use has to be checked). A quick glance through the Twitter developer agreement and policy suggests that it is generally OK to use their data.
    1. Similarly to Twitter, we can use datasets from similar apps like Instagram (which should be the most relevant), Facebook and even FourSquare or Yelp!. Most of them have APIs which allow non-commercial use of public posts, of which we only require the location or hashtag data, so it should be fair use.
    1. Notable sites I came across are [this](http://www.emilio.ferrara.name/datasets/), where a researcher based at USC offers a dataset on FB and Instagram to which we could consider requesting access, and [this](https://gwu-libraries.github.io/sfm-ui/posts/2017-09-14-twitter-data).
    1. Apps like Kort could be used to further filter out unwanted buildings. New features could even be added to Kort whereby users add "playlists" of places/reviews/information correlating with the popularity of a place, e.g. average waiting time or crowdedness (to derive a weight for how popular a location is). This would extend Kort beyond just correcting wrong or adding missing information, and consolidating all functions we want under one platform is likely to increase the user base and developer base as well.
1. (*tbc.*) *[Template text: "Xxx gives evidence of activity and contributes yyy weight to be selected as AOIs..."]*

# Status and Outlook

This is work in progress. We're still welcoming comments on everything :smile:. No implementation started yet. tbc.
# RESOURCES

Bibliography:

* "Google’s best travel feature is an orange blob" by Jacob Kastrenakes, [The Verge (July 9, 2017)](https://www.theverge.com/2017/7/9/15941406/google-maps-best-travel-feature-areas-of-interest)
* "Google Maps’s Moat" [by O'Beirne (Dec 2017)](https://www.justinobeirne.com/google-maps-moat)
* [Google's original blog post about AOI](https://blog.google/products/maps/discover-action-around-you-with-updated/)
* [Article by CityLab on socioeconomic trends in AOIs](https://www.citylab.com/design/2016/08/google-maps-areas-of-interest/493670/)
* [Article by Search Engine Watch on socioeconomic AOI selection (or lack thereof) and possible implications](https://searchenginewatch.com/2016/08/31/google-maps-update-decides-which-areas-are-of-interest-for-users/)
* [Google Maps API Documentation](https://developers.google.com/maps/documentation/android-api/poi)
* [search.ch - Swiss search engine](https://map.search.ch/)
* "A very fast density based clustering library for geographic points" (DBSCAN) by Vladimir Agafonkin. [Weblink](https://github.com/mapbox/dobbyscan).
* "Urban Street Network Centrality" by Geoff Boeing. [Weblink](http://geoffboeing.com/2018/01/urban-street-network-centrality/).
* [Mapping Flickr](http://geonet.oii.ox.ac.uk/blog/mapping-flickr/)
* Gridded Population of the World (GPW) v4. [Weblink](http://sedac.ciesin.columbia.edu/data/collection/gpw-v4).

# APPENDICES

## Appendix: OSM Tags for AOI

This is a list of places people visit on a daily basis, e.g. grocers, supermarkets, boutiques, clinics. Generally excluded are places that are widely distributed (e.g. car shops, farms) and places that can be found almost everywhere (e.g. convenience stores, kiosks, public toilets and benches); these are listed in the appendix on tag exceptions. The places and buildings listed here form part of the area of an AOI. There are also OSM objects (like ATMs, currently marked as an exception below) that contribute to a kind of 'importance', where a threshold decides whether an area is an AOI.
A decision still has to be made about which kinds of locations to include as AOIs. In GMaps, for example, colleges, universities and certain school campuses are marked as AOI (e.g. ITE College West, Singapore), and some of these campuses have restaurants and canteens that are open to the public.

List:

* shop=mall, =bakery, =beverages, =butcher, =chocolate, =coffee, =confectionery, =deli, =frozen_food, =greengrocer, =health_food, =ice_cream, =pasta, =pastry, =seafood, =spices, =tea, =department_store, =supermarket, =bag, =boutique, =clothes, =fashion, =jewelry, =leather, =shoes, =tailor, =watches, =chemist, =cosmetics, =hairdresser, =medical_supply, =electrical, =hardware, =electronics, =sports, =swimming_pool, =collector, =games, =music, =books, =gift, =stationery, =ticket, =laundry, =pet, =tobacco, =toys
* amenity=pub, =bar, =cafe, =restaurant, =pharmacy, =bank, =fast_food, =food_court, =ice_cream, =library, =music_school, =school, =language_school, =ferry_terminal, =clinic, =doctors, =hospital, =veterinary, =dentist, =arts_centre, =cinema, =community_centre, =casino, =fountain, =nightclub, =studio, =theatre, =dojo, =internet_cafe, =marketplace, =post_office, =townhall
* leisure=adult_gaming_centre, =amusement_arcade, =beach_resort, =fitness_centre, =garden, =ice_rink, =sports_centre, =water_park
* ...
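
The first step of the algorithm, selecting relevant POI by tag, could be sketched as below. This is a minimal illustration under several assumptions: the two dictionaries contain only a small subset of the tag lists in these appendices, the function name `is_aoi_poi` is made up for this sketch, and letting an exception override an included tag is just one possible rule.

```python
# Small, illustrative subset of the tags listed in this appendix.
INCLUDED = {
    "shop": {"mall", "bakery", "supermarket", "books"},
    "amenity": {"pub", "bar", "cafe", "restaurant", "cinema"},
    "leisure": {"fitness_centre", "water_park"},
}

# Small, illustrative subset of the exceptions appendix.
EXCLUDED = {
    "shop": {"vacant", "car", "convenience", "kiosk"},
    "amenity": {"atm", "bench", "parking", "toilets"},
    "leisure": {"park", "pitch", "playground"},
}


def is_aoi_poi(tags):
    """Return True if an OSM object's tags make it count towards an AOI.

    `tags` is a plain dict of OSM key/value pairs, e.g. {"amenity": "cafe"}.
    """
    # office=* is treated as an exception as a whole (assumed rule).
    if "office" in tags:
        return False
    # Any explicitly excluded tag overrides an included one (assumed rule).
    for key, value in tags.items():
        if value in EXCLUDED.get(key, ()):
            return False
    # Otherwise at least one tag has to be on the include list.
    return any(value in INCLUDED.get(key, ()) for key, value in tags.items())
```

For example, `is_aoi_poi({"amenity": "cafe", "name": "Example"})` is accepted, while `is_aoi_poi({"amenity": "atm"})` is rejected. Objects such as ATMs that should still add to the 'importance' of an area would need a separate treatment not shown here.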
## Appendix: OSM Tag Exceptions for AOI

List of tags that are exceptions and are not to be included in the AOI boundary calculation:

* shop=vacant, =farm, =car, =convenience, =dairy, =kiosk, =bicycle, =boat, =jetski, =motorcycle, =snowmobile
* amenity=public_bookcase, =bicycle_parking, =bicycle_rental, =boat_rental, =boat_sharing, =bus_station, =car_rental, =car_sharing, =fuel, =motorcycle_parking, =parking, =parking_entrance, =parking_space, =taxi, =atm, =bench, =clock, =coworking_space, =grave_yard, =cemetery, =kitchen, =place_of_worship, =post_box, =recycling, =reuse_station, =shelter, =shower, =table, =sanitary_dump_station, =telephone, =toilets, =vending_machine, =waste_basket, =waste_disposal, =waste_transfer_station, =watering_place, =water_point, =fire_station, =police
* office=* (Almost all offices are not AOI in GMaps, with the exception of certain high-rise buildings that also house POI like restaurants, cafes, etc.)
* leisure=bird_hide, =common, =disc_golf_course, =dog_park, =fishing, =miniature_golf, =nature_reserve, =park, =picnic_table, =pitch, =playground, =summer_camp, =wildlife_hide
* (retail parks/strip malls)
* *(tbc.)*

# (SCRATCHPAD)

This is just a scratchpad section, mostly with notes to self and to others...

* I stumbled upon this heatmap of where people take the most photos, which was then overlaid on top of GMaps. According to Wikipedia, its heatmap was derived from crowd-sourced locations, and it might be worth a look/collaboration -- by Kang?
* ...

Table:

| col1 | col2 (right aligned) |
| ---- | -------------------: |
| ...  | xxx                  |

**Table of contents:**

[TOC]

[THE END]