Does anyone know where I can get a csv file of a horticulture database for a project I'm working on? I'm trying to create an easy-to-use online database to help me source plants for my design projects. The project is
Thanks for the posts. I have found a few. I had already looked at the USDA csv, but it's over 200k plants, many of which are weeds or food crops. It would be rather time-consuming to go through and pick out the plants significant to landscaping.
I'll check out the links you guys gave. With most of the stuff I have found, people want to sell their database. I'm willing to trade information with anyone else who has some, so it's not like I'm only looking to receive.
No. It puts it all in one field. So after sorting through 200k plants to find the ones significant to landscaping, I'd still have to transform it. After all that, it is missing all of the criteria relevant to selecting a plant for a landscape project. It seems to me that it is more trouble than it's worth just to get the scientific/common name and nothing else.
You should be able to parse the words into separate columns easily. Assuming, for example, that you have things like this in one field:
Pieris japonica
(in other words, all words separated by spaces), do a search-and-replace that converts every space into some otherwise unused character, like the tilde (~). Then use the spreadsheet's Text to Columns feature with the tilde as the delimiter. Voila.
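For anyone who would rather script this than do it in a spreadsheet, here is a minimal Python sketch of the same idea. The function name `split_name` and the three-column limit are my own assumptions for illustration, not anything defined by the USDA file.

```python
def split_name(name, max_parts=3):
    """Split a space-separated plant name into exactly max_parts columns.

    Missing columns are padded with empty strings; any extra words
    stay together in the last column.
    """
    parts = name.split(None, max_parts - 1)
    return parts + [""] * (max_parts - len(parts))


print(split_name("Pieris japonica"))  # ['Pieris', 'japonica', '']
```

This only works for names that really are plain space-separated words; anything with quotes or a missing species lands in the wrong column.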
I understand parsing text files to csv and then flipping it to a database. I've worked on quite a few databases, as my day job is as a computer geek (I'm only a plant geek by night/weekends). The major problem with the USDA csv is that it contains a massive amount of irrelevant data. Beyond that, not all plants fit neatly into the Genus Species Cultivar/Var. pattern. For example, Abelia 'Edward Goucher' omits the species and has 2 words for the cultivar. I've been doing quite a bit of clean-up on the csv files I've been able to get. I'm just dreading tackling the USDA csv with 200k+ entries, then eliminating tens of thousands of irrelevant entries, and then cleaning up the 40-60% that don't easily convert.
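As a rough sketch of how the quoted-cultivar exception might be handled in Python: the function below pulls out anything in single quotes as the cultivar and treats the first remaining word as the genus. The name `parse_plant_name` is hypothetical, and the sketch assumes cultivars are always wrapped in single quotes. It does not separate "var." or "subsp." ranks and will trip on names that contain an apostrophe.

```python
import re

# Assumption: a cultivar is whatever sits between a pair of single quotes.
CULTIVAR_RE = re.compile(r"'\s*([^']+?)\s*'")


def parse_plant_name(raw):
    """Return (genus, species, cultivar); missing parts are empty strings."""
    match = CULTIVAR_RE.search(raw)
    cultivar = match.group(1) if match else ""
    # Remove the quoted cultivar so only genus and species words remain.
    rest = CULTIVAR_RE.sub(" ", raw)
    words = rest.split()
    genus = words[0] if words else ""
    species = " ".join(words[1:])
    return genus, species, cultivar


print(parse_plant_name("Abelia ' Edward Goucher'"))
# ('Abelia', '', 'Edward Goucher')
print(parse_plant_name("Pieris japonica"))
# ('Pieris', 'japonica', '')
```

Records that come back with an empty genus or an unexpectedly long species are the ones worth flagging for cleanup by hand.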
Then I end up with tens of thousands of relevant plants with zero information about them. Over time as I use them in design projects I can add data, but I'm still forced to go outside my database to pull relevant information.
I know this is going to be a massive undertaking and an on-going project. I appreciate all the suggestions and if anyone is interested in trading content, get with me at
I do this all day long with grocery data that was assembled by slobs. If I could reach through the phone and grab some of these people by the throat......
Interested in splitting the task? You take half the file and I take the other? Reassemble it later?
I see the link to the query, but just so we're both working on the exact same file, do you want to compress it and email it to me, and let me know what record numbers you're going to work on so I can pick up at a later point? The "public" email address you see here is a working one. Just let me know when you send the file. I don't check this email often because it's mostly there to catch garbage generated by newsgroup exposure.
I downloaded the csv from the USDA. Importing it into Excel, I had to break it in half. The first half is around 120k lines. I emailed it to you. Take a look and make sure I counted right. It may be slightly less than 200k.
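If breaking the file up by hand gets tedious, something like the following Python sketch could do it. It is only an illustration: the function name `split_csv`, the output file naming, and the UTF-8 encoding are assumptions on my part, and it expects the first line of the file to be a header row, which it repeats in every piece.

```python
import csv
from itertools import islice


def split_csv(src_path, rows_per_file, prefix="part"):
    """Split src_path into numbered CSV files of at most rows_per_file data rows.

    The header row is copied into each output file. Returns the list of
    paths written. The encoding is an assumption; adjust it to match the file.
    """
    paths = []
    with open(src_path, newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader)
        while True:
            chunk = list(islice(reader, rows_per_file))
            if not chunk:
                break
            path = f"{prefix}_{len(paths) + 1}.csv"
            with open(path, "w", newline="", encoding="utf-8") as out:
                writer = csv.writer(out)
                writer.writerow(header)
                writer.writerows(chunk)
            paths.append(path)
    return paths
```

Calling `split_csv("plants.csv", 100000)` on a hypothetical `plants.csv` would write `part_1.csv`, `part_2.csv`, and so on, and the length of the returned list gives a quick check on the total count.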
Just got it. I think I misunderstood what your version of "clean up the data" meant. I was referring to cleaning up the raw data, which I see as necessary for many of the "exceptional" records. But you're talking about editing out categories of plants. I honestly don't have that much time to work on it that way.
Actually, some additional poking around the USDA site gave me this link:
formatting link
I think this is what you are talking about as far as filtering. I went straight for the download, but I think this will filter out a lot of what I don't want. This gave me 39k entries, plus it's broken out so I have far less database cleanup to do. Now I just have to reorganize it and filter out the weed and food items. Thanks for putting me on the right track. This is much more doable.
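For that last filtering pass, a small Python sketch along these lines might help. Everything specific here is made up for illustration: I don't know what columns the filtered download actually has, so the "Category" column, its values, and the sample rows are all hypothetical placeholders.

```python
def drop_excluded(rows, column, excluded):
    """Yield only the rows whose value in `column` is not in `excluded`.

    Comparison is case-insensitive and ignores surrounding whitespace.
    Rows that lack the column are kept.
    """
    excluded_lower = {value.lower() for value in excluded}
    for row in rows:
        if row.get(column, "").strip().lower() not in excluded_lower:
            yield row


# Hypothetical sample rows; the real file's columns will differ.
sample = [
    {"Name": "Plant A", "Category": "Ornamental"},
    {"Name": "Plant B", "Category": "Weed"},
    {"Name": "Plant C", "Category": "food crop"},
]
kept = list(drop_excluded(sample, "Category", {"weed", "food crop"}))
print([row["Name"] for row in kept])  # ['Plant A']
```

The rows can come straight from `csv.DictReader`, so the same function works on the real file once you know which column marks the weeds and food items.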