Remove jsessionid in URL rewrite in Spring MVC

mmohab · Mar 11, 2011 · Viewed 38.7k times

I am using Spring MVC and have a problem with jsessionid: when cookies are not enabled in the browser, jsessionid is injected into the URL, producing a URL like this:

http://localhost/categories;jsessionid=Bsls4aQFXA5RUDcmZKV5iw?cid=13001

This causes no problem in browsers, but when Google crawls my site (it seems Google's crawlers don't have cookies :)), it stores my site's URLs in that form, so my site appears in search results with URLs containing jsessionid.

The site itself runs without any problems, but I would prefer the URLs in Google search results to be clean, without jsessionid.

Any help?

Answer

BalusC · Mar 11, 2011

To the point: simply don't let your app create sessions as long as users do not log in or perform POST actions. Do not call request.getSession() or request.getSession(true). Do not create or manage session-scoped beans for non-logged-in users. Ensure that the frameworks you're using do not create sessions unnecessarily without you telling them to.

If this is really impossible due to the way your application is designed or due to limitations/bugs of the (MVC) frameworks used, then your best bet is to strip the jsessionid identifier from the URLs served to Googlebot. You can use Tuckey's URL rewrite filter for this (which is roughly the Java counterpart of Apache HTTPD's well-known mod_rewrite). Here is the relevant extract from its configuration examples page:

Hide jsessionid for requests from googlebot.


<outbound-rule>
    <name>Strip URL Session IDs</name>
    <note>
        Strip ;jsessionid=XXX from urls passed through response.encodeURL().
        The characters ? and # are the only things we can use to find out where the jsessionid ends.
        The expression in 'from' below contains three capture groups, the last two being optional.
            1, everything before ;jsessionid
            2, everything after ;jsessionid=XXX starting with a ? (to get the query string) up to #
            3, everything after ;jsessionid=XXX and optionally ?XXX, starting with a # (to get the target)
        eg,
        from index.jsp;jsessionid=sss?qqq to index.jsp?qqq
        from index.jsp;jsessionid=sss?qqq#ttt to index.jsp?qqq#ttt
        from index.jsp;jsessionid=asdasdasdsadsadasd#dfds to index.jsp#dfds
        from u.jsp;jsessionid=wert.hg to u.jsp
        from /;jsessionid=tyu to /
    </note>
    <condition name="user-agent">googlebot</condition>
    <from>^(.*?)(?:\;jsessionid=[^\?#]*)?(\?[^#]*)?(#.*)?$</from>
    <to>$1$2$3</to>
</outbound-rule>
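
To see what the from/to pair does before deploying it, the same expression can be tried with plain java.util.regex. This is only a sketch, assuming the filter evaluates `from` with Java regex semantics; the class name `JsessionidStripDemo` and the method `strip` are made up for illustration and are not part of the rewrite filter. It also ignores the user-agent condition and looks only at the substitution.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only: applies the rule's from/to pair
// to a single URL string outside of any servlet container.
final class JsessionidStripDemo {

    // Same expression as the <from> element of the outbound rule.
    private static final Pattern FROM =
            Pattern.compile("^(.*?)(?:\\;jsessionid=[^\\?#]*)?(\\?[^#]*)?(#.*)?$");

    // Same substitution as the <to> element; a group that did not
    // participate in the match expands to an empty string.
    private static final String TO = "$1$2$3";

    static String strip(String url) {
        Matcher m = FROM.matcher(url);
        // Leave the input untouched if the expression does not match at all.
        return m.matches() ? m.replaceFirst(TO) : url;
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://localhost/categories;jsessionid=Bsls4aQFXA5RUDcmZKV5iw?cid=13001",
            "index.jsp;jsessionid=sss?qqq#ttt",
            "index.jsp;jsessionid=asdasdasdsadsadasd#dfds",
            "/;jsessionid=tyu"
        };
        for (String url : urls) {
            System.out.println(url + " -> " + strip(url));
        }
    }
}
```

Because the first group is lazy and every later group is optional, the match stops at the first ;jsessionid= it finds, and the query string and fragment are carried over unchanged.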