Less is More (Branching WebPages to Increase Relevancy)

Let’s assume I’m looking for “Asynchronous processing support in Servlet 3.0”. I navigate my browser to Google and type “asynchronous support servlet 3.0”. I pick an interesting (arbitrary) article from the search results. Suppose I consider part of the introduction irrelevant to me (I might be an expert on the background concepts), while the rest of the article is still very interesting. What I would like to do is bookmark the page without the introduction and the background concepts. Unfortunately, the web page doesn’t let me remove any of its content. The best I can do is submit a comment and hope the author removes some of the content. Most likely the author will just ignore my comment and leave the article as it is. What I’d really like is to tune that article, so that next time I won’t have to search within the page for the relevant information.

Web 2.0 applications involve users in improving their websites. For example, Wikipedia allows everybody with knowledge of a subject to improve the quality of its content. In the case of Wikipedia, the joint effort of a group of people results in valuable content. Whenever an arbitrary individual feels that the content of an article is incorrect, nothing stops him from modifying it.

Most web pages need some sort of restriction. For example, you wouldn’t want your customers to change the contents of your web store the way they change articles on Wikipedia. However, deleting content doesn’t make the remaining information incorrect; it just hides part of the information. In some cases it might even hide information that is irrelevant to the end user. Even better, something I consider useless might also be useless to my neighbor (saving him a search within the page).

So deleting information from a page and bookmarking the result might help a subsequent visitor. Obviously you wouldn’t want your original content to be deleted by a visitor of your website. Giving the visitor the opportunity to make a branch of the web page, in which he can delete content, could help him create a more appropriate view. That view, with some traceability to the original document, could then be shared with others who could benefit from it. My guess is that a document with a higher information density is more likely to be found by a search engine. Neither the raw term frequency nor the number of relevant documents increases, but the information becomes denser (the distance between keywords might decrease). The relative weight of a keyword within a document might increase as soon as a visitor deletes irrelevant content.
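
To make the density argument a bit more concrete, here is a minimal Java sketch. It assumes that the weight of a keyword is simply its relative term frequency (occurrences divided by the total number of words), which is only one of many signals a real search engine might use. The class name `KeywordDensity`, its methods, and the sample text are all made up for illustration.

```java
import java.util.Arrays;
import java.util.Locale;

public class KeywordDensity {

    // Splits text into lowercase words on anything that is not a letter or digit.
    public static String[] tokenize(String text) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+"))
                .filter(word -> !word.isEmpty())
                .toArray(String[]::new);
    }

    // Raw term frequency: how often the keyword occurs in the text.
    public static long count(String text, String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        return Arrays.stream(tokenize(text)).filter(needle::equals).count();
    }

    // Relative term frequency: occurrences divided by the total number of words.
    // This is the (assumed) keyword weight used in this sketch.
    public static double relativeFrequency(String text, String keyword) {
        String[] words = tokenize(text);
        if (words.length == 0) {
            return 0.0;
        }
        return (double) count(text, keyword) / words.length;
    }

    public static void main(String[] args) {
        // Hypothetical page: an introduction the visitor finds irrelevant, plus the body.
        String intro = "Some background the expert reader already knows. ";
        String body = "Servlet 3.0 adds asynchronous processing to the servlet API.";

        String original = intro + body; // the page as published
        String branch = body;           // the visitor's branch with the intro deleted

        System.out.printf("original: count=%d, weight=%.3f%n",
                count(original, "servlet"), relativeFrequency(original, "servlet"));
        System.out.printf("branch:   count=%d, weight=%.3f%n",
                count(branch, "servlet"), relativeFrequency(branch, "servlet"));
    }
}
```

The raw count of the keyword is the same in both versions, but because the branch contains fewer words in total, the keyword's share of the document goes up.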

Let’s just hope there will be an API someday to enable web pages to save a local search.

Google TechTalks

Are your unit tests also getting more complicated than production code? Do you feel like you are discovering “Bug Clusters” every once in a while, or are you looking for a new “Distributed Source Code Repository” like Git? Would you like to know more about the latest developments in “Hybrid Transactional Memory”, “MapReduce”, “JSR-203”, “Selenium”, “The Java Memory Model” or “Closures”?
All of these topics are covered in the presentations that Google makes available as “Google TechTalks”. Generally, an expert is invited to talk for 45 minutes to an hour about anything that has to do with technology. New video presentations are published very frequently. The speakers come from universities, have written a book, or are exploring new technologies for big companies like Google and Sun Microsystems.
Most of the videos offer decent quality. At the time of recording, people take into account that the presentation will be published on the internet. The slides are clearly readable and sometimes the camera focuses on the speaker. The videos are searchable via: http://research.google.com/video.html.

Multi-core

Clock rates no longer increase; instead, systems are equipped with multiple CPUs. To make optimal use of these resources, a program has to be divided over multiple CPUs. It becomes a lot harder to predict the flow of a program, because the work is not always divided equally over the CPUs. What does this mean for a Java developer in a multi-core era? That’s what I discussed with Gil Tene (CTO) and Cliff Click (Chief JVM Architect) from Azul Systems and Shay Hassidim (Deputy CTO) from GigaSpaces Technologies.
Ten years ago we started with a 350MHz CPU. About two years later my parents bought a second computer with a 600MHz CPU. They almost doubled their speed for the same price as two years before. This example more or less illustrates Moore’s law. A program that had run on the 350MHz CPU could now be executed almost twice as fast on the new machine. Unfortunately, chip manufacturers concluded that they can’t keep increasing clock rates under normal circumstances. Instead, machines are now equipped with multiple CPUs. Azul Systems offers Java systems with 864 CPUs in one machine [Azul09]. Besides these powerful machines, most modern laptops have multiple CPUs on board. According to Azul Systems: “Not everybody will have to deal with parallel programming. However in the future virtually all developers will have to be aware of multi-core environments, and change their thinking to match” [Azul09].
The advantage of a multi-core system is that it exposes more resources. The big disadvantage is that the free performance lunch is over [Sutter05] [Herlihy08]. According to Azul Systems: “Lots of the scaling issues depend on the nature of the application. A multi-core system has more computing resource but the usage of the cores depends on how the software is written” [Azul09]. Obviously, multiple independent programs can be executed in parallel. However, that’s not always the performance increase the customer expects. A program that needs more performance than a single CPU can deliver will only run faster on a multi-core system if the developer has optimized it for a multi-core environment. This means the developer has to coordinate how the program is divided over multiple CPUs. According to Gartner, parallel programming is one of the biggest challenges the software industry is facing at this moment [Gartner08] [Patterson08].
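
As a small illustration of what “dividing a program over multiple CPUs” can look like in Java, here is a minimal sketch that sums an array by handing each worker thread its own slice. It is not taken from the interview; the class name `ParallelSum` and the chunking strategy are assumptions chosen for illustration, and it uses only the standard `java.util.concurrent` classes with Java 8 lambda syntax.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelSum {

    // Sums the array by giving each worker thread its own contiguous slice.
    public static long sum(int[] data, int workers) {
        if (workers < 1) {
            throw new IllegalArgumentException("workers must be at least 1");
        }
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            int chunk = (data.length + workers - 1) / workers; // ceiling division
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                final int from = Math.min(w * chunk, data.length);
                final int to = Math.min(from + chunk, data.length);
                // Each task only reads its own slice, so no locking is needed.
                tasks.add(() -> {
                    long partial = 0;
                    for (int i = from; i < to; i++) {
                        partial += data[i];
                    }
                    return partial;
                });
            }
            long total = 0;
            // invokeAll blocks until every task has finished.
            for (Future<Long> result : pool.invokeAll(tasks)) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while summing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a worker failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i % 10;
        }
        // Use one worker per core that the JVM reports.
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("cores=" + cores + ", sum=" + sum(data, cores));
    }
}
```

Even in this simple case the developer decides how the work is split and how the partial results are combined. When the slices are not equally expensive, some cores sit idle while others are still working, which is the kind of coordination problem described above.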

Read the full article at Java Developer in a Multi-Core Era