I wanted to start building a tool to model the impact of wind speed on the output of the UK’s wind farms. There are three principal components for this: the location of the wind farm, accurate weather data for that location and a generation function that calculates the output of the wind farm based upon physical and environmental conditions.
The first two elements can be professionally sourced but I wanted to build this tool using publicly available information as far as possible. This post describes the sourcing of physical data about the wind farms.
Sourcing all wind farm data
RenewableUK is a great source of renewable power information and they have detailed information about every wind farm on their website. Rather frustratingly, they keep that behind a members-only link. Thankfully, they do publish an up-to-date table on their site that includes all the data we need: longitude, latitude, type and capacity.
As of June 2012, the data is presented in a simple HTML table, so we can easily write some code to scrape it out, using techniques similar to those used with BMRA prices.
Coding and demonstrating the extraction
The code presented here is a stem to which further code could be appended. We might want to save the information to a separate file (as we did with the oil prices), upload it to a data source or retain it internally for further analysis; any of these functions could easily be added.
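As a rough idea of what the "save to a separate file" extension might look like, here is a minimal sketch that writes scraped rows out as CSV. It is written for Python 3 rather than the 2.7 used in the main listing, and the column names, sample rows and the `write_rows` helper are all made up for illustration; the real table has 13 columns.

```python
import csv
import io

# Hypothetical column names and sample rows, for illustration only; the
# real table has 13 columns whose headings are not reproduced here.
FIELDS = ["name", "latitude", "longitude", "capacity_mw"]
sample_rows = [
    ("Example Farm A", "55.10", "-3.20", "10.0"),
    ("Example Farm B", "51.90", "-4.60", "24.5"),
]


def write_rows(rows, handle, fields=FIELDS):
    """Write a header line followed by one CSV line per wind farm."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(fields)
    writer.writerows(rows)


# An in-memory buffer stands in for a real file, which would be opened
# with open("windfarms.csv", "w", newline="").
buffer = io.StringIO()
write_rows(sample_rows, buffer)
print(buffer.getvalue())
```

Passing the file handle in, rather than a filename, keeps the helper easy to point at a real file, a buffer or standard output.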
I’m using Python 2.7.x and BeautifulSoup to handle the HTML parsing; full installation and configuration instructions are available from their respective sources. The source data table has a fixed width of 13 columns, but the number of rows varies as units are commissioned or decommissioned.
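Because the width is fixed, a flat list of cell values can be regrouped into rows by slicing. The sketch below shows the same slicing-and-zipping trick the `pairs` helper in the listing relies on, using a made-up three-column sample instead of the real 13 columns; the `chunk` name and the data are illustrative only.

```python
def chunk(cells, width):
    """Group a flat list of cell values into tuples of `width` items.

    Taking every width-th item from each starting offset and zipping the
    slices back together rebuilds the rows. Any incomplete trailing row
    is dropped, because zip stops at the shortest slice.
    """
    return list(zip(*[cells[i::width] for i in range(width)]))


# Made-up sample: two complete rows of three cells, plus one stray cell.
flat = ["Farm A", "55.1", "-3.2", "Farm B", "51.9", "-4.6", "stray"]
rows = chunk(flat, 3)
print(rows)
```

Note that a stray cell at the end is silently discarded, so it may be worth checking that the number of cells is a multiple of the width before trusting the result.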
The complete code is shown below. You’re welcome to use it as you wish but please attribute back to this site.
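"""Locates and scrapes Wind Farm data"""
__author__ = "Patrick Avis"
__email__ = "email@example.com"

import urllib2  # Python 2.7 standard library

from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3


def pairs(l, n):
    # Group the flat list l into tuples of n items (one tuple per table row).
    return zip(*[l[i::n] for i in range(n)])


def main():
    url = urllib2.urlopen("http://www.bwea.com/ukwed/operational.asp")
    soup = BeautifulSoup(url)
    # Assumes the first table on the page is the one holding the data.
    table_extract = soup.find('table')
    rows = table_extract.findAll('tr')
    outputwind = []
    for tr in rows:
        cols = tr.findAll('td')
        for td in cols:
            # Join every text fragment in the cell, then trim whitespace.
            text = ''.join(td.findAll(text=True))
            outputwind.append(text.strip())
    all_wind_farms = pairs(outputwind, 13)
    for windfarm in all_wind_farms:
        print(windfarm)  # placeholder: save or process each row here


if __name__ == "__main__":
    main()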
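The listing targets Python 2.7 and BeautifulSoup 3. For anyone on Python 3 without BeautifulSoup installed, the cell extraction step could be approximated with the standard library's `html.parser`. This is only a sketch under stated assumptions: the `CellCollector` class and the two-column sample HTML are invented for illustration, and it does not handle nested tables.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the stripped text of every <td> cell, in document order."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._buffer = None  # list of text fragments while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._buffer = []

    def handle_data(self, data):
        # Only keep text that appears inside a <td> cell.
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._buffer is not None:
            self.cells.append("".join(self._buffer).strip())
            self._buffer = None


# Made-up two-column sample standing in for the real 13-column table.
SAMPLE = """
<table>
  <tr><th>Name</th><th>Capacity</th></tr>
  <tr><td>Example Farm A</td><td> 10.0 </td></tr>
  <tr><td>Example <b>Farm</b> B</td><td>24.5</td></tr>
</table>
"""

parser = CellCollector()
parser.feed(SAMPLE)
parser.close()
cells = parser.cells
print(cells)
```

The resulting flat list of cells can then be regrouped into rows of a fixed width, just as the main listing does with its 13-column table.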