We are removing two sections from our site.
/warehouse/
/clothing/
I'd like to send all the URLs beneath these two to a single (404) landing page saying the item has been removed. I'd like to clean up the query strings too, if possible.
Where do I start?
2 Answers
Answer 1
If you're using nginx, you can just add a pair of location sections. They'll match as long as no more specific location takes precedence. Check out the documentation for more detail.
location /warehouse/ { return 410; }
location /clothing/ { return 410; }
If there are too many locations, it could be cumbersome to list them separately, so you can use regex like this:
location ~* ^/(warehouse|clothing|something-else)/ { return 410; }
If you want a customized 410 page, add configuration like this in your server block:
error_page 410 /410.html;
location = /410.html {
    # Put the page at /var/www/error/410.html
    root /var/www/error/;
    internal;
}
Replace 410 with 404 if you want to return that status code. I believe 410 "Gone" is the more appropriate response, but YMMV.
I'd suggest doing this in whatever is closest to the client, so if nginx is in front of Apache, do it with nginx. That way you have fewer round trips.
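If you want to do this in Apache, you can do it with RedirectMatch. With the gone status the target URL argument has to be omitted, so the custom page is set with ErrorDocument instead:
# The trailing `.*$` is probably unnecessary and is omitted here.
RedirectMatch gone "^/(warehouse|clothing)/"
ErrorDocument 410 /410.html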
Alternatively, I'd suggest mod_rewrite as a somewhat more flexible option:
RewriteEngine on
RewriteRule ^/(warehouse|clothing)/ - [G,L]
ErrorDocument 410 /410.html
Here [G] means "gone" (410 status code). If you want a 404 response, do this instead:
RewriteEngine on
RewriteRule ^/(warehouse|clothing)/ - [R=404,L]
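Note that you need ^/ in your regexes to indicate that the path doesn't just contain /warehouse/ or /clothing/ but starts with one of them. Otherwise you'll get unwanted responses on addresses like /about/clothing/. I'm not exactly sure whether you need the trailing .*$, but I believe you don't; I don't have Apache at hand to test this. Add it if the rules don't work for you (i.e. ^/(warehouse|clothing)/.*$). One caveat: in a per-directory context such as the document root's .htaccess, mod_rewrite strips the leading slash before matching, so the RewriteRule pattern there would be ^(warehouse|clothing)/ instead.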
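To see what the ^/ anchor and the optional trailing .*$ actually change, here is a small sketch. It uses Python's re module only as a stand-in for the regex engines in nginx and Apache, on the assumption that they agree on patterns this simple; the sample paths and the is_removed helper are made up for illustration.

```python
import re

# Patterns from the answer above. Python's `re` is only a stand-in here for
# the PCRE engines used by nginx and Apache (an assumption for this sketch).
ANCHORED = re.compile(r"^/(warehouse|clothing)/")
ANCHORED_WITH_TAIL = re.compile(r"^/(warehouse|clothing)/.*$")
UNANCHORED = re.compile(r"/(warehouse|clothing)/")


def is_removed(path: str, pattern: re.Pattern = ANCHORED) -> bool:
    """Return True if the pattern matches somewhere in the path.

    `search` is used because the servers also accept a partial match,
    which is why the trailing `.*$` adds nothing.
    """
    return pattern.search(path) is not None


# Hypothetical sample paths, not taken from the original site.
SAMPLE_PATHS = ["/warehouse/item-1", "/clothing/", "/about/clothing/", "/warehouses/x"]

if __name__ == "__main__":
    for path in SAMPLE_PATHS:
        print(
            path,
            is_removed(path, ANCHORED),
            is_removed(path, ANCHORED_WITH_TAIL),
            is_removed(path, UNANCHORED),
        )
```

On these samples the anchored pattern behaves the same with and without the .*$ tail, while the unanchored one also catches /about/clothing/, which is the case the ^/ is there to prevent.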
Or you can handle the logic in your application, which may be the only way if your base layout contains something user-dependent and you want consistency. A specific answer can't be written without knowing what language/framework/stack you use.
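Since the stack isn't known, the following is only a rough sketch of the application-level option, assuming a Python WSGI application. The names removed_sections_middleware, demo_app, and GONE_BODY are hypothetical; a real site would render its own layout instead of the placeholder body.

```python
from http import HTTPStatus

# The two sections named in the question.
REMOVED_PREFIXES = ("/warehouse/", "/clothing/")

# Hypothetical placeholder; a real app would render its normal page layout.
GONE_BODY = b"<h1>This item has been removed</h1>"


def removed_sections_middleware(app):
    """Wrap a WSGI app and answer 410 Gone for anything under the removed sections."""

    def wrapper(environ, start_response):
        # PATH_INFO excludes the query string, so ?foo=bar never affects the match.
        path = environ.get("PATH_INFO", "")
        if path.startswith(REMOVED_PREFIXES):
            status = f"{HTTPStatus.GONE.value} {HTTPStatus.GONE.phrase}"
            headers = [
                ("Content-Type", "text/html; charset=utf-8"),
                ("Content-Length", str(len(GONE_BODY))),
            ]
            start_response(status, headers)
            return [GONE_BODY]
        # Everything else goes to the wrapped application unchanged.
        return app(environ, start_response)

    return wrapper


def demo_app(environ, start_response):
    """Stand-in for the real application (hypothetical)."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


application = removed_sections_middleware(demo_app)
```

Because the status is returned directly and nothing is redirected, the query string is simply ignored. Swap HTTPStatus.GONE for HTTPStatus.NOT_FOUND if a 404 is preferred.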
Answer 2
First, I'd recommend returning a 410 (Gone) rather than a 404, to acknowledge that the resource once existed.
In Apache, you'd do something like the following. Refer to this page for more information.
RedirectMatch permanent "^/(warehouse|clothing)/?.*" "http://www.example.com/404"
In IIS, your web.config would look something like the following. Note that IIS won't let you use question marks in your regex, since it interprets them as the start of a query string. Refer to this page for more information.
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <system.webServer>
        <rewrite>
            <rules>
                <rule name="404 Redirect" stopProcessing="true">
                    <!-- URL Rewrite matches against the path without its leading slash, so the pattern starts at ^ rather than ^/ -->
                    <match url="^(warehouse|clothing)/" />
                    <action type="Redirect" url="404" appendQueryString="true" redirectType="Permanent" />
                    <conditions trackAllCaptures="true"></conditions>
                </rule>
            </rules>
        </rewrite>
        <httpProtocol allowKeepAlive="false" />
        <caching enabled="false" />
        <urlCompression doDynamicCompression="true" />
    </system.webServer>
</configuration>
Updated to anchor the regex at the beginning of the path, based on drdaeman's comment.