Digital Claxon

Entries tagged as ‘google’

Still Searching…for 500 words

October 22, 2007 · 1 Comment

The second half of John Battelle’s The Search discusses Google’s rise to riches, respect and the women that come along with those things. Ok, it doesn’t discuss the women, but I wanted to spice up this blog post a little bit so as not to turn off the few readers I have.

In 2003, Google adjusted its algorithm to provide better search results and filter out spam sites. Google’s software flagged sites that illegitimately made money through AdWords revenue sharing. These sites earned their owners money for every visitor who clicked on the page. The change in the algorithm also affected thousands of legitimate sites, classifying them as spam and dropping their ranking out of the top 100 results. Going from first to last in searches killed once-successful businesses. Businesses found they had to adjust their Web sites for search engine optimization (SEO).

Another chapter talked about allowing deep linking to password-protected sites. Some sites, such as the Wall Street Journal, require a subscription to view articles. This affects their search results because the indexing program Google uses cannot read the articles, causing the page to be ignored by the entirety of the Internet. Battelle argues that people will pay for subscription services if enough people point to the article from other Web sites. I highly doubt this. People aren’t willing to pay for music; they aren’t going to pay to view the Wall Street Journal online when there are many other free sources of news.

The problem I see is barriers set up to stop us from gathering information, such as private libraries (like the JHU VPN thing we have to use to get research studies). I’m not entirely sure why universities keep this stuff under lock and key. If they are promoting knowledge and education like I think they are, we should have access to all of this information. Who is it going to hurt if I get my hands on a study about hamsters dancing to bad music or if I find that tobacco leaves are really a great alternative to fiberglass insulation? Do we not want terrorists to know this stuff? It seems to me that it is a Marxist struggle of the haves (people with the knowledge) and the have-nots (people without access to the information). Why do universities shut the average Joe out, but promote education at the same time?

On privacy issues, Google says it will never distribute our personal information. In the news recently, Verizon admitted it turned over customer information to the government. Since the USA PATRIOT Act, the government has had the right to demand information from companies about their users. The act says the government and the company do not have to let the public know when this happens. I imagine that Google has done this too and, having denied it, will never admit to it.

Google tried to remain a private company for a long time, but its size forced it to go public. There was some apprehension about buying tech stocks after the Internet bubble burst, but Google has seen nothing but success since going on the market. Its shares are above $600 at the moment. This could be a sign that tech stocks are back and people are willing to invest in new technology again. To keep the innovations coming and shareholders happy, Google Labs runs a think tank that has created a variety of computer applications that have made our lives a lot easier. Google Maps and Analytics came out of this thing.

Finally, Battelle talks about the future of search. He thinks we’ve come up with search engines that are at about 10% of their potential. I can see why he thinks this. We still get irrelevant search results, and much information either isn’t available or requires a password to get to. If these barriers were overcome we’d have a modern-day Library of Alexandria. The government is actually working on digitizing information at this very moment. I think a collaborative effort would stop different groups from digitizing the same information twice and speed up the process. I think that a lot of things have already been digitized illegally by copyright violators, and I think this resource shouldn’t be passed over. The work of these pirates is often of great quality, and documents found through searches should be used to check items off the to-digitize list.

In the near future, I’m hoping we’ll have one massive online library that will be a central repository for all information. I think we have trouble finding the results we are looking for, even though we know they exist. With such a library, a search for the habits of video game players, for example, will give more than just a blog or commercial site; all statistics compiled from scientific studies will be shown, and gaps in knowledge will be easily identified for future research.

Categories: Internet
Tagged:

Searching…for something to write 500 words on

October 14, 2007 · No Comments

I just read the first half of John Battelle’s The Search. The book covers the history of search engines, culminating with our current Google experience.

One of the interesting things Battelle discussed was Google Zeitgeist (which has since been replaced with Hot Trends). This tool ranks popular search terms every day. It goes way beyond the simple tracking of search terms; it tracks the collective thoughts of the Internet. It shows what is on people’s minds and what information they want to learn about. It shows that Paris Hilton has peaked; it shows that people are worried about terrorist attacks; it shows people want the hoverboards in Back to the Future 2. Ok, so I made those up, but conceptually it is true. Hot Trends has summed up the consciousness and feelings of the Internet population each day since its inception. This is an incredible marketing tool. Movie producers should go on this site every day to see what topic would create the next blockbuster. Reporters should check it each day to write articles on topics people want to read. Historians should view it as the exact feelings and state of the culture that day. This tool is amazing.

The rest of the first 150 pages gives a history of searching on the Internet. In the late 1990s, companies thought portals were the thing of the future. Searches were not given much thought and they worked “well enough.” Yahoo and AOL were riding high on their success. Searches still needed a lot of refinement.

Up until that point in time, searches were typically used in offline databases. Defining the parameters for these searches was rather easy compared to an Internet search. In the offline search, the database was typically used for a well-defined purpose and all possible combinations of searches could be programmed into the database. There are only so many ways a person can ask for the number of books still in stock in a database. The database won’t be used for anything else. On the Internet, this is very different. There are no defined parameters for an Internet search. A person can ask for anything. The problem was figuring out the proper algorithm to bring up the most relevant results.

I remember search engines of the past (I used HotBot) bringing back poor results. Typing in what you were after didn’t typically provide useful results. Site owners used meta tags to help search engines find their sites, but they had to use the same terms you did for their page to come up in a search. Thinking of alternate search terms became an art. Shortly afterward, spammers got wise and started putting in random words to bring up their sites in search engines.
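To make the meta tag problem concrete, here is a rough sketch in Python of how a keyword-only search might behave. This is not how HotBot or any real engine worked; the function name, the site names and the keyword lists are all made up for illustration.

```python
def keyword_search(query, pages):
    """Rank pages by how many query words appear in their meta keywords.

    `pages` maps a (hypothetical) site name to the keywords its owner chose.
    """
    terms = set(query.lower().split())
    matches = []
    for site, keywords in pages.items():
        overlap = terms & {k.lower() for k in keywords}
        if overlap:
            matches.append((len(overlap), site))
    # Most matching words first; ties broken alphabetically.
    matches.sort(key=lambda m: (-m[0], m[1]))
    return [site for _, site in matches]


# Made-up sites: an honest car lot and a page stuffed with random words.
pages = {
    "carlot.example": ["automobile", "dealership"],
    "spam.example": ["car", "music", "news", "games", "free"],
}

results = keyword_search("used car", pages)
```

In this toy setup the honest car lot never shows up for "used car" because its owner wrote "automobile," while the page stuffed with random words does. That is the guessing game, and the spam problem, in miniature.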

To help curtail this problem, Google developed the PageRank system. Sites with more links pointing at them came back higher in the search results. Legitimate sites should be well linked. This system, used in conjunction with the web crawlers that searched and indexed site text, provided amazingly accurate search results.
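For the curious, the link-counting idea can be sketched in a few lines of Python. This is a simplified, textbook-style version of PageRank and not Google’s actual implementation; the damping value of 0.85, the number of iterations and the example sites are my own assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank over a dict of site -> list of sites it links to."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the sites it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over everyone.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical mini-web: nobody links to the spam site.
links = {
    "news.example": ["shop.example"],
    "blog.example": ["news.example", "shop.example"],
    "shop.example": ["news.example"],
    "spam.example": ["shop.example"],
}

ranks = pagerank(links)
ordered = sorted(ranks, key=ranks.get, reverse=True)
```

The point of the sketch is that a site’s score comes from who links to it, not from the words it claims about itself. The spam site can link out all it wants, but with nothing pointing back at it, it stays near the bottom.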

Categories: Internet
Tagged: