Problem description
An odd one: I'm trying to read the <Head> section of a lot of different websites, and one particular type of server, Apache, sometimes returns a 403 Forbidden status code. Not all Apache servers do this, so it may be a configuration setting or a particular version of the server.
When I then check the URL with a web browser (Firefox, for example), the page loads fine. The code looks roughly like this:
var client = new WebClient();
var stream = client.OpenRead(new Uri("http://en.wikipedia.org/wiki/Barack_Obama"));
Normally, a 403 indicates an access permission failure, but these are ordinary, unsecured pages. I suspect Apache is filtering on something in the request headers, since I'm not setting any.
Maybe someone who knows more about Apache can give me some idea of what's missing from the headers. I'd like to keep the headers as small as possible to minimize bandwidth.
Thanks.
Recommended answer
Try setting the User-Agent header:
string _UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)";
client.Headers.Add(HttpRequestHeader.UserAgent, _UserAgent);
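For context, here is a minimal sketch of how the two snippets might fit together in one program. The class name HeadFetcher and the helpers CreateClient and ReadStart are hypothetical names introduced for illustration, and the 2048-character limit is an arbitrary assumption about how much of the page is needed to cover the <Head> section. Whether a shorter User-Agent string would also be accepted depends on the server's configuration, which the answer above does not confirm.

```csharp
using System;
using System.IO;
using System.Net;

static class HeadFetcher
{
    // Hypothetical helper: builds a WebClient that sends the given User-Agent.
    public static WebClient CreateClient(string userAgent)
    {
        var client = new WebClient();
        client.Headers.Add(HttpRequestHeader.UserAgent, userAgent);
        return client;
    }

    // Hypothetical helper: returns up to maxChars characters from the start
    // of the response body, then disposes the reader, stream, and client.
    public static string ReadStart(Uri uri, string userAgent, int maxChars)
    {
        using (var client = CreateClient(userAgent))
        using (var stream = client.OpenRead(uri))
        using (var reader = new StreamReader(stream))
        {
            var buffer = new char[maxChars];
            int read = reader.ReadBlock(buffer, 0, maxChars);
            return new string(buffer, 0, read);
        }
    }

    static void Main()
    {
        // Same User-Agent string and URL as in the snippets above.
        const string userAgent =
            "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)";
        var uri = new Uri("http://en.wikipedia.org/wiki/Barack_Obama");

        try
        {
            Console.WriteLine(ReadStart(uri, userAgent, 2048));
        }
        catch (WebException ex)
        {
            // A 403 (or any other HTTP error status) surfaces as a WebException.
            var response = ex.Response as HttpWebResponse;
            if (response != null)
                Console.WriteLine("HTTP " + (int)response.StatusCode + " " + response.StatusDescription);
            else
                Console.WriteLine(ex.Message);
        }
    }
}
```

Catching WebException and printing the status code makes it easier to tell a 403 apart from other failures, such as DNS or connection errors, when checking many sites in a loop.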