out of time
03 March 2010
The moment has arrived: after three release candidates and lots of great feedback from Sunspot users and contributors, Sunspot 1.0 and Sunspot::Rails 1.0 are out. It’s got a bunch of exciting new features, and I’m going to tell you all about them. You know how I do.
There’s been a bit of confusion regarding the status of the Sunspot::Rails project. I shall clarify in the form of a brief bulleted list of facts:
- Sunspot and Sunspot::Rails now live in the same git repository.
- Going forward, Sunspot and Sunspot::Rails will be released in tandem, with matching versions. Sunspot::Rails will always have a dependency on its corresponding Sunspot version.
- Sunspot and Sunspot::Rails are still separate gems.
Hope that clears things up.
Upgrading to Sunspot 1.0
Sunspot 1.0 uses the totally awesome Solr 1.4 release; it’s the first Sunspot version that does. This opens up a lot of new functionality, which we’ll get to later, but it also means that you’ll need to upgrade any production Solr instances you’re running to Solr 1.4.
Sunspot 1.0 introduces the
sunspot-installer executable, which modifies an existing Solr installation’s configuration files to work with Sunspot. Its main purpose is to ease the pain of upgrading – if you’ve got a schema that’s built for an older version of Sunspot, then simply run
sunspot-installer my/solr/home and it’ll work with the latest and greatest. For instance, if you are using Sunspot::Rails, upgrading is as easy as this:
$ rake sunspot:solr:stop $ sudo gem install sunspot_rails -v '1.0.0' $ sunspot-installer -v ./solr $ rake sunspot:solr:start
If you’ve got a fresh, default Solr installation that you’re planning on using with Sunspot, pass the
-f flag to
sunspot-installer, which will simply copy Sunspot’s schema and solrconfig over the packaged ones; this works better because Solr’s example schema uses naming conventions that conflict with Sunspot’s.
And lastly, once you’ve modified your search configuration to take advantage of Sunspot 1.0 and Solr 1.4’s cool new capabilities, you’ll almost definitely need to reindex. Nobody said life was easy.
What’s new in Sunspot 1.0
This is easily the most exciting new feature in Solr 1.4, which is why it’s first. Imagine the following situation: every Post belongs to a Category. You would like to build a search UI where your users can select one or more categories, along with some other filtering options. With the faceting functionality we all know and love, this simply isn’t possible: once the user selects a category, the categories facet will only show values that match the search results, which in this case is a single value: the category they have already selected.
What we’d really like to do is, for the purposes of calculating the available categories in the categories facet, ignore the fact that a category is already selected. And that’s exactly what multiselect faceting lets us do. Here’s how:
Sunspot.search(Post) do keywords('some keywords') with(:blog_id, 1) category_filter = with(:category_id, 3) facet(:category_id, :exclude => category_filter) end
What’s going on here is that the
with method now has a meaningful return value – essentially a handle to the filter it generates (it doesn’t matter what the object it returns actually is). That handle can then be passed to the new
:exclude option in the
facet method, which instructs Solr to ignore that filter for the purposes of calculating that facet’s rows. And there you have it: multiselect faceting.
Named Field Facets
Field facets can have lots of options applied to them, not least the awesome one we just talked about. And especially now that we have that last option, one can easily imagine situations in which you’d want to build a facet on the same row more than once, with different options.
Enter named field facets. They’re pretty simple to use:
Sunspot.search(Post) do keywords('some keywords') blog_filter = with(:blog_id, 1) facet(:category_id) facet(:category_id, :name => :all_blog_category_id, :exclude => blog_filter) end
In this case, we’re constructing two different facets on the
:category_id field: the first is a normal facet giving us all the categories that match our search, and the second gives us all the categories that match our search across all blogs. The
:name parameter is also what you use to retrieve the facet from the search:
It’s well-known that Solr commits are very expensive and shouldn’t be done on a frequent basis. That means that if you have an application whose data changes frequently, keeping the Solr index completely up-to-date all the time is more or less impossible. Sunspot 1.0 eases the pain of that by introducing “assumed inconsistency”; in particular, if the search results reference an object that doesn’t actually exist in the primary data store, Solr will just quietly drop it from the results.
This behavior is automatic when using the
Search#results method. If you use the
hits method, by default the data store isn’t queried at all, so Sunspot has no way to double-checke that the referenced results actually exist. Tell Sunspot to do that check by passing the
:verify => true parameter into the
The same logic applies for instantiated facets – by default, the referenced instances aren’t loaded until the first time you ask for one, after the collection of rows has already been built. Just pass
:verify => true into the
rows method to make sure that all of your instantiated facets reference actual instances.
New Field Types
Sunspot 1.0 introduces several new field types – some of them were available in Solr 1.3 and just not supported by Sunspot, and some are new in Solr 1.4.
First, the not-so-new types:
double. They’re what they sound like – bigger versions of
In the department of more exciting features, Solr 1.4 and Sunspot 1.0 introduce new “Trie” field types. These are numeric types – they store integers, floats, or times – but they index them in such a way that range queries are much faster. So, if you’ve got numeric or date fields that you do range searches (which for our purpose includes
less_than as well as
between), supercharge your efficiency by defining them like so:
Sunspot.setup(Post) do integer :comment_count, :trie => true float :average_rating, :trie => true time :published_at, :trie => true end
Note that in Sunspot 1.0, lat/lngs are stored in Trie fields as well, so you’ll definitely want to reindex your data if you’re using geo search. And speaking of geo search…
Out with LocalSolr, in with solr-spatial-light
Previous versions of Sunspot have not allowed searches to be performed with both a fulltext component and a local component, because LocalSolr clobbers various Solr features in such a way that, as far as I can tell, it’s not possible. After trying out the trunk version of LocalSolr, which did fix one problem but then introduced another, I decided to just do it myself. The result is solr-spatial-light, a very small Solr plugin that exposes the lucene-spatial library in Solr.
From Sunspot’s standpoint, the API has changed a bit (although it’s still mostly backward-compatible):
Sunspot.search(Post) do near([40,5, -72.3], :distance => 5, :sort => true) end
Aside from playing nice with keyword search, another advantage of solr-spatial-light is that you can sort by distance without limiting results to a certain radius; thus, both the
:sort options are optional. Of course, if you don’t pass in either, you won’t do much other than make your search slower.
New Session Proxies
The SessionProxy pattern, first appearing in a recent Sunspot::Rails release, now gets first-class support in the core Sunspot library. A SessionProxy is simply an object that presents the same API as Sunspot::Session, adding behavior to the core Session functionality. There are lots of potential applications for this pattern (some of which you’ll probably see in future releases), but this release ships with three:
MasterSlaveSessionProxy are automatically injected by Sunspot::Rails (the latter only if
config/sunspot.yml contains a master solr configuration). If you wish to manually inject a session proxy, simply use the
Sunspot.session = ThreadLocalSessionProxy.new(Sunspot.configuration)
Support for class reloading
Sunspot now explicitly supports class reloading of the type done in Rails, Sinatra, etc., for classes that are set up for search. As well as yielding more consistent behavior with class reloading, this fixes a development-only memory leak.
Deletion by query
I can’t really think of a use case for this, but it seemed like a cool feature: you can now remove documents from Solr using a query; just use the same DSL as you would in a search:
Sunspot.remove(Post) do with(:blog_id, 1) end
What’s new in Sunspot::Rails 1.0
No more shelling out to start/stop Solr
Sunspot 1.0 introduces a real
Sunspot::Server class, which manages starting and daemonizing the embedded Solr instance.
Sunspot::Rails::Server now simply subclasses
Sunspot::Server, which means it doesn’t have to shell out to the
sunspot-solr executable. Big plus for environments like Bundler where the gem executables aren’t necessarily in the
Different logic for spec support
Recent versions of Sunspot::Rails have automatically disconnected Solr during specs unless it was specifically enabled; Sunspot::Rails 1.0 reverses this, allowing explicit disabling of Solr in specs. Another change here is that if Solr is disabled in specs, all Solr operations are stubbed, including searches.
Here’s how to disable Sunspot in your specs:
describe Post do describe "without search" do disconnect_sunspot it "should work awesome" do #... end end end
Logging of Solr requests in Rails log
You can now log all Solr requests in the Rails log, with pretty coloring and everything. Do do this, put
require "sunspot/rails/solr_logging" in an initializer.
New Searchable::index method
reindex class method on ActiveRecord models is joined by
index, which re-adds/updates all of the instances of that model to Solr, but doesn’t clear out the index first.
That’s all for now
I’m very excited to have Sunspot 1.0 out there, but this certainly isn’t the end of the road. Some ideas that have been bouncing around the mailing list and may make appearances in future 1.x releases:
- Spell checking
- Facet prefixes
- Search auto-suggest