Introduction to Geographic Information Systems (UEP 0232-01)
GIS Data Quality Assessment: Expansion of the Green Line –
What to consider when implementing a new station at College Avenue/ Boston Avenue, Medford?
Description of the project
The extension of the Green Line to College Avenue/Boston Avenue in Medford raises questions about the feasibility of such a project. The new stations will improve public transport in areas that have so far been underserved; Tufts University would be one of the beneficiaries. It is less clear, however, what needs to be done in the affected area to make the station work. This concerns, for instance, current land uses, existing infrastructure, and buildings in the area surrounding the future station, all of which probably need to be revised so that the area fits and supports the new station. To give only a few examples: once the station opens, a higher share of pedestrians and cyclists can be expected, for whom appropriate infrastructure needs to be built. A more vibrant life around the new station will also likely require more commercial use than there currently is, so the existing buildings and their current purposes need to be reviewed. The new station might pose some additional difficulties because it is located very close to the boundary between Somerville and Medford. Because responsibility could be blurred here, I will pay special attention to this issue. In the following I give an overview of the layers I considered potentially helpful and assess whether they are really useful for the purpose of my project.
In order to prepare the area for future changes, some things need to be sorted out regarding the city boundaries, existing rail lines, existing buildings, existing road centerlines, nearby hydrographical features, and current land uses. This is necessary because GIS layers differ in quality, e.g. in resolution, accuracy, and currency. The following questions arise from analyzing these features:
- Where do different layers locate the existing road centerlines? Are there any differences visible if map layers from different clearinghouses are used? This could be essential when roads have to be removed, restructured, or simply narrowed in order to fit in the new station.
- Where are hydrographical features located in the area surrounding the future station? Can any hazards be identified that could seriously affect the station? Heavy rain, for instance, could cause flooding that would destabilize the area around the river. Thus, we need to make sure that the new station is located at a sufficient distance from any hydrographical features.
- Where is the political boundary between Medford and Somerville? Are there any areas where the two municipalities overlap, or areas that belong to neither municipality? This could be important when changes have to be made, since it has to be clear which municipality would be responsible for undertaking them.
- Where are the existing rail tracks located? This is essential for adding the new rail tracks needed for the Green Line. Since the two new tracks are going to be built in addition to the old ones, we need a certain accuracy in order to know how much extra space is needed. This holds especially true for the area of the new station, since extra space will be needed to place the station between the two new tracks (see picture).
- Where does ArcGIS locate existing buildings? In order to know how much space is available, the planners need to know exactly where the existing buildings are (see picture). It is also important to know which buildings exist in the area surrounding the new station in case some of them have to be removed. Thus, questions such as the sizes and heights of these buildings could be of interest.
- What are the current land uses in the area around the new station? In order to make the area livelier, as in Davis Square, there is probably a need for more commercial use. Currently, however, many buildings are used by Tufts University or serve a residential purpose. Thus, it would be interesting to figure out where more commercial use could be introduced.
Discussion of Different Data Layers
1.) Street Centerlines
For the street centerlines I used layers from two different sources, one based on TIGER 2000 and one provided by the cities of Medford and Somerville. The layers created by the cities of Medford and Somerville fit the orthophoto provided by MassGIS closely: the centerlines are located where the actual streets are. What can be seen, though, is that the lines do not match exactly where they should meet (where Somerville and Medford abut). For this project this is not too serious, because the station will not be built on the border of the two municipalities. If any restructuring of College Avenue needs to be done, though, this error should be taken into consideration.
Both data sets are also considerably old: the one for Medford dates from 1990-1999 (“The exact date of this dataset is not certain, but its ground condition is from 1990 – 1999.”) and the one for Somerville from 2002. With regard to positional accuracy, the City of Somerville mentions: “The center of physical roadway pavement may or may not represent the center of the road right of way. Road right of ways may taper or change width.” In general the two datasets seem quite suitable for the purpose of my project, although little information is available on positional accuracy or other constraints. Another, rather negative, aspect is how little additional information comes with these layers. The only attribute that could be of use is the length of the street. Other than that, we do not find anything useful in these layers (e.g. street types that could help identify the actual use and throughput).
If we compare this to the layer based on the Census 2000 TIGER data, we see that the positional accuracy of the TIGER data is not very good. The lines are quite far away from the actual streets and in some sections overlap buildings or green spaces. This cannot be explained by the year the data was collected (2000), as if the streets had been located differently at that time, but is solely due to poor positional accuracy. Using this data layer could lead to seriously problematic outcomes, because the lines are not reliable for any sort of micro-level planning.
With regard to positional accuracy, the metadata says: “The positional accuracy varies with the source materials used, but generally the information is no better than the established national map Accuracy standards for 1:100,000-scale maps from the U.S. Geological Survey (USGS). […] The level of positional accuracy in the StreetMap files is not suitable for high-precision measurement applications such as engineering problems, property transfers, or other uses that might require highly accurate measurements of the earth's surface.” I assume that the latter statement applies to my project, which makes it difficult to get any use out of this layer.
What has to be mentioned, though, is that the TIGER layer and its metadata provide much more detailed information, e.g. on how the data was processed and where it comes from. The layer also provides information on the street type (e.g. highways, major roads, local roads, minor roads, etc.). Hence, one could make assumptions about the actual throughput and importance of a street if any changes were considered. However, since the positional accuracy of this layer is, as shown, poor, we should not use it for any planning on a micro level.
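To put the quoted standard into numbers, the following is a rough sketch of the horizontal tolerance implied by the National Map Accuracy Standards. It rests on my reading of the standard, which is not spelled out in the TIGER metadata: an allowance of 1/30 inch on the map for scales larger than 1:20,000 and 1/50 inch for scales of 1:20,000 and smaller. The function name is purely illustrative.

```python
def nmas_tolerance_feet(scale_denominator: float) -> float:
    """Approximate NMAS horizontal tolerance on the ground, in feet."""
    # Assumption: 1/30 inch at map scale for scales larger than 1:20,000,
    # 1/50 inch for scales of 1:20,000 or smaller.
    map_inches = 1 / 30 if scale_denominator < 20_000 else 1 / 50
    ground_inches = map_inches * scale_denominator
    return ground_inches / 12  # inches to feet


for denominator in (800, 1_200, 6_000, 100_000):
    feet = nmas_tolerance_feet(denominator)
    meters = feet * 0.3048
    print(f"1:{denominator:,} -> about {feet:.1f} ft ({meters:.1f} m)")
```

Under these assumptions, 1:100,000 source material corresponds to a tolerance of roughly 167 feet (about 51 m), which is far too coarse for deciding where a street or station edge lies.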
2.) Hydrography
For the hydrography I used layers from MassGIS and the National Atlas (ESRI Data Maps). The MassGIS layer covers the hydrography surrounding the new station quite well. However, some places where, according to the basemap, there should be a river are not covered. Regarding positional accuracy, the metadata says: “Areas within many surface water supply watersheds have been enhanced by using higher resolution streams and lakes from the MassDEP Wetlands datalayer, many areas have also been field verified.” It also covers additional information such as how the data was collected and processed. The layer was created in 2010 and is thus quite current. All in all, the metadata provides good and current information, and the layer shows good positional accuracy, covering most of the actual water areas. Any buffer areas that need to be calculated would thus in most cases be quite precise. The only negative aspect is that additional attribute information is missing: we do not know how much water the river carries, how likely flooding is, etc. This would be useful information for building the station in a safe environment.
The other layer shows the National Atlas Water Feature Lines (ESRI Data Maps). As can be seen, the layer is quite inconsistent, changing from polygon to line seemingly at random. Starting roughly where the highway crosses the Charles River, the river continues not as a polygon but merely as a line. Because of this, the distance from the location of the new station to the river would be calculated imprecisely, and buffer zones would not represent the real distances. With respect to positional accuracy, the metadata says: “The geospatial part of this data set was originally extracted from the individual 1:2,000,000-scale State boundary Digital Line Graph (DLG) files produced by the U.S. Geological Survey which have a positional accuracy of 1,720 meters. It was updated several times using various sources whose horizontal positional accuracies are unknown. […] Largest scale when displaying the data is 1:1,000,000”. In addition to this positional inaccuracy, the data is dated, having been collected from 1995-2002. On the other hand, the differentiation of water types (swamps, rivers, channels, ponds, etc.) would be useful. However, in our area only the Charles River seems to be in closer proximity to the future station.
In conclusion, the MassGIS layer provides more current and more accurate data than the one from the National Atlas. However, it does not allow us to distinguish between different kinds of hydrographical features. Whether that would be useful for our project is another question. In any case, further investigation would be needed in order to make reliable predictions about flooding probabilities.
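The effect of a river drawn as a line rather than a polygon can be illustrated with a small sketch. All coordinates below are hypothetical planar values in meters, not taken from either layer, and the river width of 80 m is an assumption for illustration only.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def point_polyline_distance(p, vertices):
    """Shortest distance from p to any segment of a polyline."""
    return min(
        point_segment_distance(p, a, b)
        for a, b in zip(vertices, vertices[1:])
    )


# Hypothetical example: station 500 m north of a river centerline.
station = (0.0, 500.0)
river_centerline = [(-400.0, 0.0), (400.0, 0.0)]
# Near bank of the same river if it were mapped as an 80 m wide polygon.
river_near_bank = [(-400.0, 40.0), (400.0, 40.0)]

d_line = point_polyline_distance(station, river_centerline)
d_bank = point_polyline_distance(station, river_near_bank)
print(f"distance to line feature:  {d_line:.0f} m")
print(f"distance to polygon bank:  {d_bank:.0f} m")
```

In this made-up case the line representation overstates the distance to the water by half the river width, so a buffer drawn around the line would not reach as far as one drawn around the polygon.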
3.) Political Boundaries
For the political boundaries I used the data provided by both the City of Medford and the City of Somerville. As can be seen in the picture on the right, there are several shortcomings in the positional accuracy of the layers. First of all, there is a mismatch in an area west of Tufts, where approximately four houses are not covered by either municipality. It would thus be unclear which city was responsible if any changes needed to be made in this area (admittedly, this area is quite far away from the station site). Moreover, when we look closer at the outermost area of the cities, we see quite a long stretch that is covered by both municipalities; that is, the political boundaries overlap. If any changes were needed there because of the new station, it would not be clear who is in charge. These areas are located quite close to the new station, which makes it very likely that changes in land use or zoning, for instance, will be needed there. Regarding the quality of the data, zooming in to a scale of 1:6,000 (about 20 feet +/- accuracy) still shows some overlaps between the two cities. Thus, the data quality is not the highest (no further information on positional accuracy is provided in the metadata).
In terms of currency, the data for Medford was collected between 1990 and 1999 (“The exact date of this dataset is not certain, but its ground condition is from 1990 - 1999.”), whereas the data for Somerville is from 2005. Some errors, or rather mismatches, in positional accuracy may also stem from this time lag. I can only speculate, but it seems quite unlikely to me that the boundaries themselves changed during the last decades. However, more current data layers might be of higher quality and thus show more accurately where the actual boundaries are.
With respect to attribute accuracy, the layers provide very little extra information. While both cities provide data on the size of their area, only Medford gives information on its population. Yet it would be useful to know how many people live in the two cities if I were to take into consideration how many people are affected by the new station. Thus, additional data and clearinghouses would need to be consulted.
4.) Existing Rail Tracks
To show where the existing rail tracks are, I used the layer provided by MassGIS. In order to add the new rail tracks for the Green Line, we need to know exactly where the old tracks are located. This holds especially true for the area of the new station, because extra space is needed here to place the station between the two new tracks. As can be seen in the picture to the right, the overall positional accuracy is very good: the mapped tracks match the existing ones very well. In order to make more detailed statements, however, we need to zoom in.
In the picture to the right, we see that even at a scale of 1:800 (approximately 2 feet +/- accuracy) the existing tracks are represented very accurately. Considering also that the data layer is quite new (the metadata gives 2008), we can make good use of this MassGIS layer. For instance, we could calculate the width that is actually available and the width needed for the station and the two new tracks. All this is clearly visible when zooming further into the map.
The layer also provides further information that could be useful when building the new tracks. The metadata says: “The layer includes active passenger, freight, and MBTA Commuter Rail and Rapid Transit railways, along with abandoned rail lines. In many instances there is more than one track per rail line, and rail yards and spurs are included. […] CTPS added several attributes pertaining to type of service, MBTA Commuter Rail status, rail line ownership, and freight and passenger operation.” Thus, if we wanted to know what kind of lines run here (type of service) and to whom they belong (ownership), we could look this up in the layer’s attributes. This could be useful when consulting the operator about what future use of the area will look like. Assuming that the current rail lines will be affected in some way by the Green Line, it is useful to know that the line is run by the MBTA and used by the MBTA and Amtrak. There are apparently two tracks in use and one that is abandoned or no longer in use. Further investigation would be needed to find out what this third track’s purpose is.
5.) Existing Buildings
To show where the existing buildings are, I again used a data layer from MassGIS. With this layer I wanted to find out how much space is available for new development at this location. Looking at the picture to the right, we see that from a zoomed-out perspective the buildings seem to be represented quite well. Although two buildings at the bottom are not represented that well (both of them belong to Tufts University), these are of little concern, since they are too far away from any restructuring. Zooming further into the map (picture to the right, scale of 1:800, approximately 2 feet +/- accuracy), we see that some of the relevant buildings are not represented very well. Thus, proposals for the planned Burget Neighborhood Path could be based on erroneous measurements. On the right of the picture we also see something that is not clearly identifiable; a quick look at Google Maps suggests that there is a building. The new Green Line tracks could have an effect here, since the MBTA probably needs some space for expansion (be it for the tracks, the station, or other services). Thus, having the building on the map would be useful for future planning.
As for currency, the data was collected in 2002. It is thus possible that some building extensions were made after that time (for instance, on the Tufts buildings that are not represented precisely). However, the data seems mostly up to date and can thus be used without major reservations regarding currency.
What other information can be derived from the metadata? There is a fairly exact description of how the data was acquired and processed (installation of a camera in an aircraft, flights and reflights, calibration, edge matching, etc.). The metadata says: “For additional accuracy verification, static survey points were collected, using static benchmarks where available. Thirty-four survey points within the project boundary were selected to allow a statistical absolute elevation verification of the data. This data set was then statistically compared to the project LIDAR DEM data after the combination of flight lines to verify accuracy both horizontal and vertical. The RMSE (Root Mean Square Error) of the LIDAR DEM was calculated using the ground GPS data to ensure that the vertical error was less than 0.15 m.” Thus, we find very good background information on how the data was produced and could potentially detect where errors were made.
However, we find very little additional information in the layer that could help answer our questions about the new station. The only extra attributes provided are the shape area and the shape length. With these we can at least say how much space is actually covered by the buildings (naturally, only if the measurements are adequate). To give just one example, in order to implement the Burget Neighborhood Path we would need to know precisely where the buildings are, how much space they take up, and how much space can be used for things like trees, greenery and, maybe, bike racks. Additional information, such as the height of the buildings, is unfortunately missing. If available, it could have been used to plan the Path more carefully and fit it better to its surroundings.
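The RMSE check described in the metadata can be sketched as follows. The elevation values are invented for illustration (the metadata mentions thirty-four survey points, whose values I do not have), and the function name is my own.

```python
import math


def rmse(measured, reference):
    """Root mean square error between two equally long value lists."""
    if len(measured) != len(reference) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(m - r) ** 2 for m, r in zip(measured, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical elevations in meters at four survey points.
lidar_elevations_m = [12.10, 15.32, 9.95, 20.48]
gps_elevations_m = [12.00, 15.40, 10.05, 20.40]

vertical_rmse = rmse(lidar_elevations_m, gps_elevations_m)
print(f"vertical RMSE: {vertical_rmse:.3f} m")
print("within 0.15 m threshold:", vertical_rmse < 0.15)
```

With these made-up numbers the RMSE comes out at roughly 0.09 m, which would pass the 0.15 m threshold quoted in the metadata.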
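As a rough sketch of how the area attribute could be used, the following computes footprint areas with the shoelace formula and the share of a study area covered by buildings. The footprints and the study area are hypothetical planar coordinates in meters, not values from the MassGIS layer.

```python
def polygon_area(vertices):
    """Planar area of a simple polygon via the shoelace formula."""
    twice_area = 0.0
    # Pair each vertex with the next one, wrapping around at the end.
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2


# Hypothetical building footprints (coordinates in meters).
footprints = {
    "building_a": [(0, 0), (30, 0), (30, 20), (0, 20)],
    "building_b": [(50, 0), (70, 0), (70, 10), (60, 10), (60, 25), (50, 25)],
}
# Hypothetical study area of 100 m x 50 m around the station.
study_area = [(0, 0), (100, 0), (100, 50), (0, 50)]

built_up = sum(polygon_area(v) for v in footprints.values())
share = built_up / polygon_area(study_area)
print(f"built-up area: {built_up:.0f} m^2 ({share:.0%} of the study area)")
```

The remaining share is an upper bound on the space available for a path, trees, or bike racks, and it is only as reliable as the digitized footprints themselves.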
6.) Current Land Uses
In the last section I wanted to look more closely at the current land uses, again using a MassGIS data layer. I wanted to find out what kinds of land use we mostly encounter around the station and whether it is possible to introduce more commercial use. The assumption was that many buildings are already used by Tufts University and that others have a residential purpose. However, more commercial use will be needed once the new station is introduced.
I mapped the area around the new station according to the different land uses. The metadata says: “The MassGIS Land Use datalayer has 37 land use classifications interpreted from 1:25,000 aerial photography. The minimum mapping unit used was one acre.” I reduced the categories to the relevant ones that can be found in this area. As can be seen in the picture to the right, most of the area surrounding the new station is classified as residential (yellow). In closer proximity, most buildings are institutional (blue), because they are owned by Tufts University. The recreational areas (green) in the immediate surroundings also belong to Tufts. Along the central axis of Boston Avenue/College Avenue there is a strip of commercial use (red) that looks rather marginal on the map. In addition, it stretches out lengthwise instead of forming a cluster (as, for example, Davis Square does). Further investigation would be needed in order to figure out what the piece of open land (beige) behind the sports facilities is and whether any use could be made of it.
When we zoom in further, to a scale of 1:1,200 (approximately 3.5 feet +/- accuracy), the red strip in the picture gives the impression of quite a big area. However, the red strip mostly covers only the area where the tracks run. Apart from that, there is actually not much space left. Also, only very few buildings seem to lie inside the commercial zone. In closest proximity to the new station, these are only the three buildings below it.
Although we might get an idea of what the land is dedicated to, these maps do not show any more precise information on different types of commerce. If we wanted to know what kinds of shops, grocery stores, or restaurants can be found in this area, the MassGIS layer would not help us, because it does not allow any further differentiation within the category of commercial use (unlike, for instance, residential use). Thus, the land use layer is too coarse to draw any deeper conclusions on how many shops there are, what sizes they have, and which chains are located here. In order to do so, we would need further information, e.g. business data obtained through geocoding, to make some assumptions about the existing and potential commercial activity in this area.
Considering positional accuracy, the different zones are quite imprecise. At the top of the latter picture we see that the boundary between two land uses, institutional and commercial, cuts through buildings owned by Tufts. Therefore, the map does not represent the actual status of these buildings. Also, in the lower part of the picture the red strip makes a curve into an area where there is no commercial use but only the Tufts sports facilities. Thus, at this scale we encounter some errors.
With respect to currency, the data was last revised in 1999. Although most of the uses (e.g. institutional, recreational, and residential) have probably not changed over time, it would naturally be better to find more current data. One feature that could be interesting for a comparison over time is that one can see how the land uses have changed since 1971. This could be especially interesting if land uses are going to be changed again in order to implement the new station in 2020.
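Part of this imprecision may simply follow from the one-acre minimum mapping unit quoted from the metadata. The short sketch below converts one acre into the side length of an equivalent square, using the standard definition of 43,560 square feet per acre; treating the mapping unit as a square is my own simplification.

```python
import math

SQUARE_FEET_PER_ACRE = 43_560
METERS_PER_FOOT = 0.3048

# Side length of a square with an area of exactly one acre.
side_feet = math.sqrt(SQUARE_FEET_PER_ACRE)
side_meters = side_feet * METERS_PER_FOOT
area_square_meters = SQUARE_FEET_PER_ACRE * METERS_PER_FOOT ** 2

print(f"one acre = {area_square_meters:.0f} m^2")
print(f"equivalent square: {side_feet:.0f} ft x {side_feet:.0f} ft "
      f"({side_meters:.0f} m x {side_meters:.0f} m)")
```

A land use patch smaller than about 209 by 209 feet would therefore not be mapped separately, so zone boundaries cutting through individual buildings are to be expected at building scale.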