David Hurth's blog

Adding Search with Lucene

What if you are making a new web application that needs to search through files instead of just web pages? You could write your own solution, or you could use an existing search engine like Lucene.

Below is an excerpt from the Step Three: Profit! post about using Lucene.

There are a number of considerations to make when adding search to your site. For instance, you can usually get by pretty well with just integrating Google search into your website. This is fast, easy, and doesn't require messing with your backend code at all.

However, this is not really what I want. I want to let users search for files, not web pages, and I want the results integrated nicely with everything else. For instance, it would be cool to use a search query as a radio playlist like you can do on Hype Machine. So I'll need to build my own search engine.

This is not really that hard to do. I would recommend you read some articles and then download Managing Gigabytes for Java (MG4J). Those articles are by Tom from AudioGalaxy. You may remember AudioGalaxy as the best thing to happen and unhappen to music in my lifetime. I know I do. More importantly, it was deliciously scalable, and for the most part it was just a search engine. So don't go writing one without learning some tips from the best.

I'm sure that a little engineering and MG4J could produce a highly scalable search engine. However, I didn't really want to spend that much time on it, so I went with a higher level solution in the form of Lucene for Java. There is also a popular version for Python. I would recommend waiting a while if you're considering using Lucy (Lucene in C with Python and Ruby bindings) because I don't consider it mature. I'd also stay away from layers on top of Lucene like Solr because if you're looking for tools to make Lucene easier to use then you're missing the point that it's already easy to use.

You can read more about using Lucene as a web server here.

So, if you are looking for a good search engine that can search for files, check out Lucene.

jsLex

Since Ajax applications are so popular, we developers are always looking for better JavaScript tools. jsLex is an Eclipse plug-in that aims to make it easier to profile and improve JavaScript/Ajax performance.

Below is an excerpt from the jsLex site.

Why jsLex?

As Ajax is used for an increasing number of projects, it is also being used in projects with larger and larger code bases. As the code base of a project grows, chances are you will run into performance issues.

Working on the Apache XAP project and Nexaweb's Universal Client Framework, both of which have large code bases, it became apparent that hand-injecting profiling code only worked so well. A new technique was needed to capture a complete view of the performance of an Ajax application, similar to using JProfiler for Java code.

jsLex is that tool!

  • Go to the features page for more information.
  • Go to the screen shots page to view screen shots of the application.

jsLex gives developers a complete picture of the performance issues within their Ajax application. By auto-injecting profiling code using the jsLex Ant task, developers don't need to modify their code to track down bottlenecks, minimizing coding errors and saving time.

Currently, developers need to hand-inject profiling code into their application. Though this can be done easily, as in the example below, such a technique offers limited information and requires the developer to know precisely where the problem is located. Otherwise, developers need to modify their code over and over as they try to find the problem area(s).

function doLongTask(){
  var start = new Date();

  // The original loop bound "infinite" was undefined; a concrete
  // iteration count stands in for the long-running work.
  for (var index = 0; index < 1000000; index++){
     // Do something here
  }

  var end = new Date();
  alert("doLongTask took " + (end - start) +
        " milliseconds to complete.");
}
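
To see why a tool like jsLex helps, here is a rough sketch of what automatically injected instrumentation could look like: a generic wrapper that times every call to a function and accumulates the results. This is purely illustrative; it is not the code the jsLex Ant task actually generates, and the names are made up.

// Illustrative sketch only - not jsLex's actual generated code.
var profileData = {};

function profiled(name, fn) {
  // Return a wrapper that times each call to fn under the given name.
  return function () {
    var start = new Date();
    var result = fn.apply(this, arguments);
    var entry = profileData[name] || (profileData[name] = { calls: 0, totalMs: 0 });
    entry.calls += 1;
    entry.totalMs += new Date() - start;
    return result;
  };
}

// An injection tool could rewrite every declaration so that each
// function is replaced by its profiled wrapper:
doLongTask = profiled('doLongTask', doLongTask);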

You can read more about jsLex here.



I think that jsLex makes a good attempt at improving performance profiling for Ajax applications. Also, since the tool plugs into the extremely popular Eclipse IDE, it can be useful to many developers. If you have used jsLex, I would love to hear about your experience with the plug-in.

How to Load In and Animate Content with jQuery

If you are interested in writing an application with jQuery, but have not used the library before then you are probably looking for a few good tutorials. Over at NETTUTS they have put together a nice tutorial about loading and animating content with jQuery.

You can read an excerpt of the tutorial below.

Step 1

First things first: go and download the latest stable release of jQuery and link to it in your document:

<script type="text/javascript" src="jQuery.js"></script>

One of the best things, in my opinion, about jQuery is its simplicity. We can achieve the functionality described above, coupled with stunning effects, in only a few lines of code.

First let’s load the jQuery library and initiate a function when the document is ready (when the DOM is loaded).

$(document).ready(function() {
	// Stuff here
});

Step 2

So what we want to do is make it so that when a user clicks on a link within the navigation menu on our page, the browser does not navigate to the corresponding page, but instead loads the content of that page within the current page.

We want to target the links within the navigation menu and run a function when they are clicked:

$('#nav li a').click(function(){
	// function here
});
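
For that selector to match anything, the page needs markup roughly along these lines. This structure is only an assumption inferred from the selectors the tutorial uses ('#nav li a', '#content', '#wrapper'); the demo's actual markup may differ:

<div id="wrapper">
    <ul id="nav">
        <li><a href="index.html">Home</a></li>
        <li><a href="about.html">About</a></li>
    </ul>
    <div id="content">
        <!-- page content is swapped in here -->
    </div>
</div>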

Let's summarize what we want this function to do in event order:

  1. Remove current page content.
  2. Get new page content and append to content DIV.

We need to define what page to get the data from when a link is clicked. All we have to do here is obtain the 'href' attribute of the clicked link and define that as the page to load the data from. We also need to define whereabouts on the requested page to pull the data from - i.e. we don't want to pull ALL the data, just the data within the 'content' div. So:

var toLoad = $(this).attr('href')+' #content';

To illustrate what the above code does, let's imagine the user clicks on the 'about' link, which links to 'about.html' - in this situation the variable would be 'about.html #content' - this is the variable we'll request in the Ajax call. First, though, we need to create a nice effect for the current page content. Instead of making it just disappear, we're going to use jQuery's 'hide' function like this:

$('#content').hide('fast',loadContent);

The above line 'hides' the #content div at a fast rate, and once that effect has finished it initiates the 'loadContent' function (which loads the new content via Ajax) - a function we will define later (in Step 4).

Step 3

Once the old content disappears with an awesome effect, we don't want to just leave the user wondering while the new content arrives (especially if they have a slow internet connection), so we'll create a little "loading" graphic so they know something is happening in the background:

$('#wrapper').append('<span id="load">LOADING...</span>');
$('#load').fadeIn('normal');

Here is the CSS applied to the newly created #load span:

#load {
	display: none;
	position: absolute;
	right: 10px;
	top: 10px;
	background: url(images/ajax-loader.gif);
	width: 43px;
	height: 11px;
	text-indent: -9999em;
}

So, by default this 'load' span is set to display: none, but when the fadeIn function is initiated (in the code above) it will fade into the top right-hand corner of the site and show our animated GIF until it is faded out again.

Step 4

So far, when a user clicks on one of the links the following will happen:

  1. The current content disappears with a cool effect
  2. A loading message appears

Now, let's write that loadContent function which we called earlier:

function loadContent() {
	$('#content').load(toLoad,'',showNewContent)
}

The loadContent function loads the requested page's content into the '#content' div and then, once that is done, calls the 'showNewContent' function:

function showNewContent() {
	$('#content').show('normal',hideLoader);
}

This showNewContent function uses jQuery's show function (which is actually a very boring name for a very cool effect) to make the new (requested) content appear within the '#content' div. Once it has shown the content, it initiates the 'hideLoader' function:

function hideLoader() {
	$('#load').fadeOut('normal');
}

We have to remember to "return false" at the end of our click function - this is so the browser does not navigate to the linked page itself.

It should work perfectly now. You can see an example of it here: [LINK]

Here is the code so far:

$(document).ready(function() {

    $('#nav li a').click(function(){

        var toLoad = $(this).attr('href')+' #content';
        $('#content').hide('fast',loadContent);
        $('#load').remove();
        $('#wrapper').append('<span id="load">LOADING...</span>');
        $('#load').fadeIn('normal');

        // Pass function references (without parentheses) so jQuery can
        // invoke them as callbacks when each effect finishes.
        function loadContent() {
            $('#content').load(toLoad,'',showNewContent)
        }
        function showNewContent() {
            $('#content').show('normal',hideLoader);
        }
        function hideLoader() {
            $('#load').fadeOut('normal');
        }
        return false;

    });
});

Step 5

You could stop there but if you're concerned about usability (which you should be) it's important to do a little more work. The problem with our current solution is that it neglects the URL. What if a user wanted to link to one of the 'pages'? - There is no way for them to do it because the URL is always the same.

So, a better way to do this would be to use the 'hash' value in the URL to indicate what the user is viewing. So if the user is viewing the 'about' content then the URL could be: 'www.website.com/#about'. We only need to add one line of code to the 'click' function for the hash to be added to the URL whenever the user clicks on a navigation link:

window.location.hash = $(this).attr('href').substr(0,$(this).attr('href').length-5);

The code above changes the URL hash value to the value of the clicked link's 'href' attribute (minus the '.html' extension). So when a user clicks on the 'home' link (href=index.html), the hash value will read '#index'.

Also, we want to make it possible for the user to type in the URL and get served the correct page. To do this we check the hash value when the page loads and change the content accordingly:

var hash = window.location.hash.substr(1);
$('#nav li a').each(function(){
    var href = $(this).attr('href');
    if(hash==href.substr(0,href.length-5)){
        var toLoad = hash+'.html #content';
        $('#content').load(toLoad);
    }
});

With this included, here is all the JavaScript code required (plus the jQuery library):

$(document).ready(function() {

    // Check for a hash value in the URL and load the matching content
    var hash = window.location.hash.substr(1);
    $('#nav li a').each(function(){
        var href = $(this).attr('href');
        if(hash==href.substr(0,href.length-5)){
            var toLoad = hash+'.html #content';
            $('#content').load(toLoad);
        }
    });

    $('#nav li a').click(function(){

        var toLoad = $(this).attr('href')+' #content';
        $('#content').hide('fast',loadContent);
        $('#load').remove();
        $('#wrapper').append('<span id="load">LOADING...</span>');
        $('#load').fadeIn('normal');
        window.location.hash = $(this).attr('href').substr(0,$(this).attr('href').length-5);

        // Pass function references (without parentheses) so jQuery can
        // invoke them as callbacks when each effect finishes.
        function loadContent() {
            $('#content').load(toLoad,'',showNewContent)
        }
        function showNewContent() {
            $('#content').show('normal',hideLoader);
        }
        function hideLoader() {
            $('#load').fadeOut('normal');
        }
        return false;

    });
});

You can read the full tutorial here.

jQuery is one of the best JavaScript/Ajax libraries that I've seen, so it is always good to see more tutorials using the library. If you haven't used jQuery before, then I would recommend trying it, and this tutorial is a great place to start.

Title Capitalization in JavaScript

John Resig has written a good JavaScript port of the excellent Perl script written by John Gruber that provides pretty capitalization of titles. It is amazing how well the script works for capitalizing titles.

Below is an excerpt from the post.

The excellent John Gruber recently released a Perl script which is capable of providing pretty capitalization of titles (generally most useful for posting links or blog posts).

The code handles a number of edge cases, as outlined by Gruber:

  • It knows about small words that should not be capitalized. Not all style guides use the same list of words — for example, many lowercase with, but I do not. The list of words is easily modified to suit your own taste/rules: "a an and as at but by en for if in of on or the to v[.]? via vs[.]?" (The only trickery here is that “v” and “vs” include optional dots, expressed in regex syntax.)
  • The script assumes that words with capitalized letters other than the first character are already correctly capitalized. This means it will leave a word like “iTunes” alone, rather than mangling it into “ITunes” or, worse, “Itunes”.
  • It also skips over any words with inline dots; “example.com” and “del.icio.us” will remain lowercase.
  • It has hard-coded hacks specifically to deal with odd cases I’ve run into, like “AT&T” and “Q&A”, both of which contain small words (at and a) which normally should be lowercase.
  • The first and last word of the title are always capitalized, so input such as “Nothing to be afraid of” will be turned into “Nothing to Be Afraid Of”.
  • A small word after a colon will be capitalized.

He goes on to provide a full list of edge cases that this script handles.
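
To make the first few rules concrete, here is a toy JavaScript sketch of the small-words logic. It is not Resig's port or Gruber's script, just an illustration; the titleCaps name is made up, though the small-word list comes from the rules quoted above.

// A toy illustration of the rules above - not the actual script.
var SMALL_WORDS = /^(a|an|and|as|at|but|by|en|for|if|in|of|on|or|the|to|v\.?|via|vs\.?)$/i;

function titleCaps(title) {
  var words = title.split(/\s+/);
  for (var i = 0; i < words.length; i++) {
    var word = words[i];
    // Leave words with internal capitals or inline dots alone
    // (e.g. "iTunes", "del.icio.us").
    if (/[A-Z]/.test(word.slice(1)) || word.indexOf(".") > -1) {
      continue;
    }
    // Small words stay lowercase, except as the first or last word.
    if (i > 0 && i < words.length - 1 && SMALL_WORDS.test(word)) {
      words[i] = word.toLowerCase();
    } else {
      words[i] = word.charAt(0).toUpperCase() + word.slice(1);
    }
  }
  return words.join(" ");
}

// titleCaps("nothing to be afraid of") -> "Nothing to Be Afraid Of"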

You can read the full post here.

I can see some uses for this script in applications that want to take user input and turn it into a properly formatted title. This could be handy in a del.icio.us- or Digg-like application.

Java: The Next App Engine Language?

Michael Podrazik has written an interesting post on what the next App Engine language will be. If you don't know about App Engine, it currently supports Python, and you can read more about it here.

Below is an excerpt from the post.

So how reasonable would it be to offer a hosted Java environment? While almost any hosting provider currently gives you the option of running PHP, and lots of 'em give you Perl, etc., virtually nobody except boutique hosting providers lets you run Java. There's a good reason for this. First of all, Java is an enterprisey language, and the apps that use Java on the server side are not especially well suited to running in a shared environment. Secondly, even if the market existed, there are technical limitations that make running Java in a shared resource pool problematic. While you can chroot PHP to prevent people from accessing the shared, underlying filesystem, with Java you can spawn threads and do lots of other things that make implementing resource consumption quotas problematic. The fact that you can't just run a Java program using an Apache module or through CGI, and the fact that there tends to be a mismatch between the skill sets that *nix ops people usually have and the skill set required to effectively manage a Java app, just further muddies the waters.

What you would really need is a customizable JVM that let a hosting provider limit what hosted apps are allowed to do. You may be able to do this with a locked down SecurityManager, but doing the kinds of things that Google is doing with the App Engine Python implementation would be even better. Not very many people have the chops to write their own VM. Google is one of them, and oh yeah, they've already sorta done it. Twice.

GWT is interesting in that you write your applications in Java with certain elements of the API stripped out and the compiler translates your code to JavaScript. Android apps are written in Java with certain elements of the API stripped out and the compiler translates your code to run on Dalvik. Why not do something like this for App Engine and make it a trifecta?

Google has been pouring a lot of resources into Java lately, and in a public way. Guice is getting lots of attention as a Spring competitor.

You can read the full post here.

So, the post predicts that the next App Engine language will be Java, but with a limited API. I would love to hear your thoughts on this prediction, so leave them in the comments.

Clean Ajax

Clean Ajax is an attempt to make Ajax development easier. It is inspired by the Java Message API and offers a reliable approach to Ajax.

Below are the features that Clean Ajax promises.

Features Provided

Clean's focus is on simplicity and speed of development, keeping the focus only on AJAX issues.

It is very important to note that this sense of simplicity does not mean poorness; Clean is not negligent about AJAX problems and needs. To accomplish the mission of improving AJAX applications, Clean provides:

  • A high level of abstraction; you can use just one facade to work with AJAX, abstracting everything else.
  • Configuration by exception; messages require minimum explicit configuration to work.
  • A simple way to customize a message's behavior and apply your own logic to it.
  • Multiple request handling; the engine is able to handle multiple requests simultaneously.
  • Exception handling; the engine is aware of exceptions that can occur and how to report them.
  • A trace console to monitor the message life cycle.
  • Cache and history control (new in version 4.1).
  • A message queue used to manage requests.
  • Garbage collection.
  • Integration with web services based on the SOAP and XML-RPC protocols.
  • A cross-browser implementation compatible with the major browsers (Internet Explorer, Firefox, Mozilla, Opera, and Netscape).

You can read more about Clean Ajax here, and you can see some nice demos here. You can download Clean Ajax here.

Below is the code from the planet example on the demo page.

function showError(e){
  alert(e);
}

function get(url, consumer, progress_bar, cache){
  // Build a message that hands the response to the given consumer
  // and reports failures through showError.
  var message = Clean.createSimpleMessage(url, consumer, showError);
  if(cache != null)
    message.cache = cache;
  if(progress_bar != null){
    // Attach a progress bar rendered inside the given element.
    var progress = new EmbeddedProgressBar(document, progress_bar);
    message.progressBar = progress;
  }
  Clean.doGet(message);
}

function post(url, consumer, form){
  var message = Clean.createSimpleMessage(url, consumer, showError);
  Clean.postFormByName(message, form, false);
}
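
Below is a guess at how these helpers might be invoked; the URL, element names, and form name are all hypothetical, and the exact semantics of the consumer argument should be checked against the Clean Ajax documentation:

// Hypothetical usage of the helpers above (all names are made up).
// Fetch a page, showing progress inside the 'progress_area' element,
// with caching enabled:
get('planets/mars.html', 'result', 'progress_area', true);

// Post the form named 'searchForm' to the server:
post('search.php', 'result', 'searchForm');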
			

The code does look very clean and small. I think there can be a lot of good uses for Clean Ajax, and I look forward to playing with it more.

How Google Friend Connect Works

You may have been reading a lot about Google Friend Connect but still not be clear on exactly how it works. Well, over at the Google Friend Connect blog they have explained it in detail.

Below is an excerpt from the post.

We figured you might be tracking the conversations about Google Friend Connect and Facebook. We want to help you understand a bit more about how it works on the Friend Connect side with respect to users' information.

People find the relationships they've built on social networks really valuable, and they want the option of bringing those friends with them elsewhere on the web. Google Friend Connect is designed to keep users fully in control of their information at all times. Users choose what social networks to link to their Friend Connect account. (They can just as easily unlink them.) We never handle passwords from other sites, we never store social graph data from other sites, and we never pass users' social network IDs to Friend Connected sites or applications.

The only user information that we pass from a social networking site to third-party applications is the user's public photo, and even that is under user control.

That's the high-level view. But what about the details? Here is more information on exactly how Friend Connect interacts with third-party social networks and applications.

  1. Google Friend Connect puts users in control over whether they're connected to their data on Facebook.
  2. Google Friend Connect only reads a small amount of user data from Facebook, and does so using Facebook's public APIs. We read the Facebook numeric id, friendly name, and public photo URLs of the user and their friends. We read no other information.
  3. The only user information that we pass from Facebook to third-party applications is the URL of the user's public photo.
  4. Google Friend Connect does not permanently store any user data retrieved from Facebook.

You can read the full post here.

This is very useful information, as I know that I was confused about exactly how it worked when I first heard of it.

Writing Your First YUI Application

If you are interested in creating applications using the Yahoo! User Interface (YUI) library, then you may be looking for a good resource to get started. Well, over at O'Reilly's Inside RIA they have put together a good post to help you get started.

Below is an excerpt from the post.

Getting Started



YUI consists of several CSS components and nearly three dozen JavaScript components, all of which are designed to empower the rapid creation of web applications that run in all the major web browsers. Yahoo! uses a Graded Browser Support approach in which we "white-list" a subset of browsers that we'll fully support — we call these the "A-Grade browsers," and, taken together, they represent more than 90% of traffic on Yahoo!'s network worldwide. YUI is tested and supported in all A-Grade browsers (including current versions of Safari, Opera, Firefox and Internet Explorer).



The best way to think about YUI and what it does for you is to consider the difference between the user interface language of the browser as compared with the desktop. In the browser, the "default" language is relatively limited. You have form elements (buttons, select menus, text inputs) and hyperlinks. On the desktop, you expect much more: Tabs, sliders, cascading menus, dialogs, tooltips, data grids, rich text editing, drag and drop, animation, autocompletion, and so on. Here's one way to visualize this difference, with the browser's native UI elements on the left and the richer set of desktop-style UI elements on the right:

[Diagram (ui-language.png): the browser's native UI elements vs. the richer set of desktop-style UI elements]

In the browser, everything on the right side of this diagram requires some hard work. YUI, like other JavaScript/CSS libraries, aims to make that work less hard.
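
As a taste of what this looks like in practice, below is a minimal sketch using the TabView widget from YUI 2.x. It assumes the yahoo-dom-event, element, and tabview scripts are already included on the page, and the container id "demo" is made up:

// Minimal sketch, assuming the YUI 2.x tabview component and its
// dependencies are loaded, and that #demo contains TabView-style markup.
YAHOO.util.Event.onDOMReady(function () {
  // Progressively enhance plain markup into desktop-style tabs.
  var tabView = new YAHOO.widget.TabView("demo");
});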

You can read the full post here.

The Yahoo! User Interface library is a great JavaScript library, and this post will give anybody who wants to learn it a good jump start.

Adobe Flash Player 10 Beta Released

Adobe has released Adobe Flash Player 10 Beta. This looks very good and has some very interesting features.

Below is the rundown of features from Adobe's website.

  • 3D Effects - Easily transform and animate any display object through 3D space while retaining full interactivity. Fast, lightweight, and native 3D effects make motion that was previously reserved for expert users available to everyone. Complex effects are simple with APIs that extend what you already know.
  • Custom Filters and Effects - Create your own portable filters, blend modes, and fills using Adobe® Pixel Bender™, the same technology used for many After Effects CS3 filters. Shaders in Flash Player are about 1KB and can be scripted and animated at runtime.
  • Advanced Text Layout - A new, highly flexible text layout engine, co-existing with TextField, enables innovation in creating new text controls by providing low-level access to text, offering right-to-left and vertical text layout, plus support for typographic elements like ligatures.
  • Enhanced Drawing API - Runtime drawing is easier and more powerful with re-styleable properties, 3D APIs, and a new way of drawing sophisticated shapes without having to code them line by line.
  • Visual Performance Improvements - Applications and videos will run smoother and faster with expanded use of hardware acceleration. By moving several visual processing tasks to the video card, the CPU is free to do more.

You can read more about Flash Player 10 Beta here.

The two features that stand out for me are the Enhanced Drawing API and the 3D Effects. These could enable some great visual effects on web pages.

Google Friend Connect

Later today Google will be launching a new service called Friend Connect. Read Write Web has written an interesting post raising some concerns about the new service.

Below is an excerpt from the post.

You Can't Use it Yet

While the whole developer and publisher world is anxiously awaiting details from the launch tonight, Google is putting a damper on adoption by limiting the Friend Connect "preview release" to a handful of white listed apps and a short list of selected websites. The company says it has to prove it can scale the infrastructure (ooh, can Google scale? I don't know, better limit the approved users to just a tiny handful!) and it wants to see what kinds of features developers and site owners want to request. Apparently the company believes this feedback is best done by making said parties look from the outside and send emails guessing about what they'd like to see once they are let inside. This seems completely backwards to me.

You Can't Touch What's Inside the Magic Box

Site owners will be able to add OpenSocial apps to their web pages - sort of. They'll be able to display them inside an iframe, a separate web page inside a web page. They won't be able to leverage that user data to change what they themselves deliver to their users.

Apps in an iframe may as well be a social sidebar à la Flock or Yoono. Those collections of social apps are probably more useful anyway.

Conversations Are Complicated

Google made it clear during their press call that they are aiming for the easiest, simplest and safest way to enable social apps to be integrated into other websites. It will take less than six months, they promise.

Let's be clear that it's not going to be easy to figure out how to enable all this user data to be mashed up in acceptably safe ways. We asked Google how they can assume that one user's friends on IMeem have permission to access their info out on other sites around the web. They said that users will have to be given the option whether to expose that info to third party sites or not, something we haven't seen any details on yet from the original source social networks. That would be even more difficult if the destination sites had read, much less write, access to that ported-in social networking data.

You can read the full post here.

While the post raises some interesting concerns, I think that the service may turn out better than the post states. While I think the initial release of the service may have some issues, Google has a proven track record that bodes well for the service over time. We can only wait and see what happens.
