Apache LocationMatch Regex

Crollster · Jul 31, 2014

My Problem

I need Apache HTTP Server (v2.4.10) to proxy requests to Tomcat for dynamic applications. The request paths not only differ from the paths in Tomcat, but are also very similar to one another. For example:

/products/<category>/<sub-category>/<sub-sub-category>/<product-id>.html proxy to: http://mycluster/pf/<product-id>.html

...and also...

/products/<category>/<sub-category>/<sub-sub-category>/<anything-not-ending-in-html> proxy to: http://mycluster/search/<anything-not-ending-in-html>

My Attempts

I'm trying to use LocationMatch regexes to handle this, but have not been fully successful. The following LocationMatch works on its own (it proxies the *.html request to <tomcat>/pf/*.html):

<LocationMatch ^/products/(?<cat>.+)/(?<subcat>.+)/(?<subsubcat>.+)/(?<partnum>.+).html>
ProxyPass balancer://mycluster/pf/%{env:MATCH_PARTNUM}.html
ProxyPassReverse balancer://mycluster/pf/%{env:MATCH_PARTNUM}.html
</LocationMatch>

This correctly proxies URLs such as /products/aaa/bbb/ccc/ddd3456.html.

However, when I also enable the regex below:

<LocationMatch ^(?!.*\.html$)/products/(?<cat>.+)/(?<subcat>.+)/(?<subsubcat>.+)((/?)|(./*))$>
ProxyPass balancer://mycluster/search/
ProxyPassReverse balancer://mycluster/search/
</LocationMatch>

Trying to access /products/aaa/bbb/ccc/ then results in a 404 page. I expect any request under "/products/aaa/bbb/ccc/" that does NOT end in .html to be passed to /search/, with any subsequent path info included (e.g. .../search/compare).

My Question

I can't quite figure out what is wrong. According to Rubular, the supplied regex is correct.

What am I missing here?

I'd appreciate any advice on resolving this!

Answer

Crollster · Aug 1, 2014

It seems the regex is a little too permissive: the .+ used for cat/subcat/subsubcat also matches "/", so each group can swallow more than one path segment and needs to be constrained. There is also a slight error in the final expression: ("./*") should be ("/.*").

Working LocationMatch:

<LocationMatch ^(?!.*\.html$)/products/(?<cat>([A-Za-z0-9\-\_])+)/(?<subcat>([A-Za-z0-9\-\_])+)/(?<subsubcat>([A-Za-z0-9\-\_])+)((/?)|(/.*))$>
ProxyPass balancer://mycluster/search/
ProxyPassReverse balancer://mycluster/search/
</LocationMatch>
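
For anyone wanting to check the difference outside Apache, here is a rough Python sketch comparing what the two patterns capture. It is only an approximation under a few assumptions: Python's re module spells named groups (?P<name>...) rather than the (?<name>...) form Apache accepts, the character class is written as [A-Za-z0-9_-] (same set of characters), and the helper name segments is made up for this example. It exercises the regexes only and says nothing about how ProxyPass itself behaves.

```python
import re

# Original (permissive) pattern from the question, with named groups
# rewritten to Python's (?P<name>...) spelling.
PERMISSIVE = re.compile(
    r"^(?!.*\.html$)/products/(?P<cat>.+)/(?P<subcat>.+)/(?P<subsubcat>.+)((/?)|(./*))$"
)

# Constrained pattern from the answer: each group is limited to a single
# path segment, and the trailing part is "/.*" instead of "./*".
SEG = r"[A-Za-z0-9_-]+"
CONSTRAINED = re.compile(
    r"^(?!.*\.html$)/products/(?P<cat>" + SEG + r")/(?P<subcat>" + SEG + r")"
    r"/(?P<subsubcat>" + SEG + r")((/?)|(/.*))$"
)


def segments(pattern, path):
    """Return (cat, subcat, subsubcat) if the path matches, else None."""
    m = pattern.match(path)
    if m is None:
        return None
    return (m.group("cat"), m.group("subcat"), m.group("subsubcat"))


if __name__ == "__main__":
    for path in (
        "/products/aaa/bbb/ccc/",
        "/products/aaa/bbb/ccc/compare",
        "/products/aaa/bbb/ccc/ddd3456.html",
    ):
        print(path, segments(PERMISSIVE, path), segments(CONSTRAINED, path))
```

With the permissive pattern, /products/aaa/bbb/ccc/compare still matches, but the groups come out as ("aaa/bbb", "ccc", "compare") because the greedy .+ runs across slashes. The constrained pattern yields ("aaa", "bbb", "ccc") for the same path, and both patterns reject the .html path because of the negative lookahead.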