IL2009: Optimizing Web Presences: SEO & Metrics
Speakers: Marshall Breeding, Andrew White, Joseph Balsamo
Marshall Breeding started the talk with his presentation, titled “SEO: Optimizing Library Web Resources for Enhanced Discovery.” Some digital object management systems do not interface well with search engines, so we need to follow the techniques that have been established in the eCommerce world for optimized discovery of and access to our library resources. “The more that you try to scam the system, the more likely you are to disappear.” If you break Google’s rules, they will likely erase you and not crawl you again. Even if you do all the right things, the search engines won’t harvest all of your pages. Marshall’s sites are being indexed at a rate of about 90%, though he used to be satisfied with 60-70%.

Marshall’s Library SEO Cookbook ideas! Use your analytics to establish initial performance benchmarks, develop content, create metadata, publish the content, optimize content delivery, use sitemaps to facilitate search engine indexing, and then finally benchmark and fine-tune your content and metadata.

Analytics is understanding the use patterns of your web content. You also need to understand the benchmarks to document the impact of your SEO techniques. You should have specific goals for your websites, not just page-view numbers. Marshall recommends Google Analytics to start with – it’s easy to implement and returns excellent statistics. Where are people coming from within the various search engines? What keywords do they type in to get to your website? As an example of change, Marshall noted that he’s noticing a growing number of users coming through Bing…many more than were coming through MSN Search. The goal is to get more traffic to your website through the search engines, and you can do that by helping them find your content. What are libraries good at? Having high-quality content. But we need to provide clean structure and strategic metadata. You will likely have a web-based delivery system like ContentDM or another CMS.
You can have one unique page for every object in the repository, each with a permalink. Minimize the navigational chrome, because it confuses the search engines. You need a <title> tag that is unique for every item, and a meta description of about 150 characters; that description is what typically shows up as the snippet in the search results. You also need to generate a sitemap. The Sitemap protocol (sitemaps.org) is supported by all the major search engines now, but was originally proposed by Google. Marshall uses a Perl script to regenerate the sitemap for his site automatically every day. If you use Google’s Webmaster Tools, you can submit your sitemaps and it will tell you how they’re doing: how many of your URLs are indexed out of how many you have, and what search terms get people to the site. The important thing is to monitor and maintain: keep making incremental changes to make your site more discoverable by search engines.
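Marshall’s daily Perl script wasn’t shown, but a generator along the same lines is only a few lines in any scripting language. Here’s a minimal Python sketch of the Sitemap protocol’s XML output (the example.org permalinks are placeholders, not Marshall’s actual URLs):

```python
# Minimal Sitemap-protocol generator for a flat list of repository
# permalinks. The example.org URLs below are placeholders.
from xml.etree import ElementTree as ET
from datetime import date

def build_sitemap(urls):
    """Return sitemap XML (as a string) for the given permalinks."""
    ns = "http://www.sitemaps.org/schemas/sitemap/0.9"
    urlset = ET.Element("urlset", xmlns=ns)
    for u in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = u
        # Last-modified date helps crawlers prioritize fresh pages.
        ET.SubElement(entry, "lastmod").text = date.today().isoformat()
    return ET.tostring(urlset, encoding="unicode")

if __name__ == "__main__":
    permalinks = [
        "https://www.example.org/item/1",
        "https://www.example.org/item/2",
    ]
    print(build_sitemap(permalinks))
```

In a real repository you would pull the permalink list from the CMS database and write the result to sitemap.xml on a daily cron job, which is essentially what Marshall described.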
Andrew and Joseph’s presentation was titled “Virtual Tachometer.” Search metrics are important for collection development, website redesigns, and deciding what new types of virtual services you want to provide; they can help you justify the decisions for new projects. They mentioned the COUNTER standards, which database vendors are supposed to follow when reporting resource use. They did a comparison between Splunk, Google Analytics, and Woopra. Google Analytics lets you look at statistics over time periods of your choice, page visits, and visitor numbers. It also shows you where your visitors are coming from by country, province, city, etc., by recognizing IP addresses. You can see what types of searches bring people to your site, i.e., what words people use to describe your services. (Sarah’s Comment: Also, you can see what is missing — what do you have that people aren’t searching for, or what words do you use that your users don’t use…it’s important to analyze what _is_ in your statistics, but also what _is not_ there.) Google Analytics has more complete mapping functionality, more powerful visualization tools, better drill-down so you can create customized reports, and advanced segments. Woopra gives you the percentage of new visitors, times of visits and page views, browser use, etc. Woopra also shows the breakdown of what countries people are coming from, but not as granular as the map views in Google Analytics. Woopra does have live stats, something that GA does not have. Within live stats, you can tag a user based on a particular browser, screen resolution, location, etc., and then track that user’s activities over time. Woopra uses Google Maps in its live system (that’s odd, no?). Splunk is a free-form log analysis tool.
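Sarah’s point about what is _not_ in the statistics can be made concrete. Given a keyword report exported from any of these tools, a few lines of Python will show which of your own service labels users never actually type (both term lists below are invented for illustration, not real report data):

```python
# Compare the vocabulary your site uses with the search terms users
# actually type. Both lists are invented examples, not real data.
from collections import Counter

site_vocabulary = {"interlibrary loan", "opac", "reference desk", "e-journals"}

# Imagine these rows came from an analytics keyword-report export.
search_queries = ["borrow from another library", "library catalog",
                  "ask a librarian", "library catalog", "e-journals"]

query_terms = Counter(search_queries)

# Labels you use that nobody searches for: candidates for relabeling.
unsearched = site_vocabulary - set(query_terms)
print("Never searched:", sorted(unsearched))

# Popular user phrasings that may deserve prominent pages.
print("Top queries:", query_terms.most_common(2))
```

The gap set (“opac,” “reference desk,” and so on in this made-up data) is exactly the “what is not there” signal Sarah describes: terminology your users don’t share with you.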
Unlike Google Analytics and Woopra, where you install a piece of tracking code on your web content and those companies do the analysis for you, with Splunk you can take the logs off of your server yourself and do your own log analysis. There are a bunch of plug-ins, too, so you can look at statistics from Apache and other tools. Looking at the top sites that refer users to their site, they could see that Serials Solutions provided a lot of referrals. Splunk also provides bar graphs similar to those in Woopra and Google Analytics, and it builds the analysis in real time as it processes the log file. You can set up custom screens, custom reports, and custom search points. Splunk does seem to be a significant memory and processing hog.
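The speakers didn’t show their Splunk configuration, but the referrer report they describe is the kind of thing you can also sketch by hand against an Apache combined-format log. A minimal Python version (the log lines below are fabricated, and the referrer hostnames are made-up stand-ins):

```python
# Count top referring hosts from Apache combined-log lines, the kind
# of report described above. The sample log lines are fabricated.
import re
from collections import Counter
from urllib.parse import urlparse

# Combined log format ends: "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referer>[^"]*)"')

sample_log = [
    '10.0.0.1 - - [09/Oct/2009:10:00:00 -0500] "GET /item/1 HTTP/1.1" 200 512 "http://sersol.example.com/link" "Mozilla/5.0"',
    '10.0.0.2 - - [09/Oct/2009:10:01:00 -0500] "GET /item/2 HTTP/1.1" 200 734 "http://sersol.example.com/link" "Mozilla/5.0"',
    '10.0.0.3 - - [09/Oct/2009:10:02:00 -0500] "GET /item/3 HTTP/1.1" 200 101 "http://www.google.com/search" "Mozilla/5.0"',
]

referrers = Counter()
for line in sample_log:
    m = LOG_RE.search(line)
    # Skip direct visits, which log an empty or "-" referer.
    if m and m.group("referer") not in ("", "-"):
        referrers[urlparse(m.group("referer")).netloc] += 1

print(referrers.most_common())
```

In practice you would iterate over the real access_log file instead of a sample list; Splunk’s value is doing this at scale, in real time, with the custom screens and reports mentioned above.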