Time, content types and showing the developer (and user) what's possible (HATEOAS)...
Time
Using UNIX time may be easier for a developer, and I do concede that point, but by using UNIX time you're doing a few things:
1) You're exposing some internal logic or data structure of the system in the resource
2) You're removing information on time zone
3) In some circumstances where greater precision is needed you're precluding the use of nanoseconds
I know that some languages don't naturally handle ISO8601 dates with nanoseconds, but I do think that's the best format.
YYYY-MM-DDTHH24:MI:SS.NNNNNNNNNZ
With that format one can keep a deeper knowledge of timezone, of precision, of human readability (and JSON.org does say that JSON is easy for humans to read).
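For instance, a minimal Python sketch of emitting that shape. Note that Python's `datetime` only carries microsecond precision, so this prints six fractional digits rather than the nine shown above; the shape of the string is otherwise the same:

```python
from datetime import datetime, timezone

def iso8601_utc(dt: datetime) -> str:
    """Render an aware datetime as an ISO 8601 UTC timestamp.

    Microsecond precision only: %f gives six fractional digits.
    """
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ")

stamp = iso8601_utc(datetime(2013, 2, 7, 12, 30, 45, 123456, tzinfo=timezone.utc))
print(stamp)  # 2013-02-07T12:30:45.123456Z
```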
Content Types
Whilst it may be pragmatic not to plumb the depths of custom content types, they should at least be mentioned and made compulsory, even if they aren't typed to a detailed level of specificity. Content-Type: application/json is fine, so long as it's used and paid attention to.
Capabilities
HATEOAS and meta-data... I like to encourage developers to explore what's possible in the API by showing them the actions that they can do on an item, and their permissions. As in... if they can update something, show them how by telling them the URL. The API should ideally not lead a developer to an action that they couldn't perform.
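A sketch of what that could look like in practice — a hypothetical `item_representation` helper whose links vary with the caller's permissions (the field names are illustrative, not from any particular standard):

```python
def item_representation(item_id: int, can_update: bool, can_delete: bool) -> dict:
    """Build a response body whose links reflect the caller's permissions."""
    links = {"self": {"href": f"/items/{item_id}"}}
    if can_update:
        links["update"] = {"href": f"/items/{item_id}", "method": "PUT"}
    if can_delete:
        links["delete"] = {"href": f"/items/{item_id}", "method": "DELETE"}
    return {"id": item_id, "_links": links}

# A read-only caller sees only the "self" link, so the API never
# advertises an action the caller couldn't perform:
print(item_representation(1234, can_update=False, can_delete=False))
```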
Minor Points
I generally prefer to restrict the possible HTTP status codes and publish that list, so that a developer implementing against the API can build a generic wrapper, know what to expect and take care of handling it. Which means, sometimes I will generalise a response rather than using a status code in one place and one place only.
I'm not so sure on using verbs for special actions like search; 'q' is a well-understood part of the query string, and I'd make that just an optional parameter on the pluralised collection URL. But... perhaps that's just a bad example.
> 1) You're exposing some internal logic or data structure of the system in the resource
Not following. What internal logic or data structure?
> 2) You're removing information on time zone
This is [almost] always a good thing. Obviously there's personal preference involved here, but I've never seen a system where storing the TZ turned out to be a good idea (not that I've seen all that many systems).
> This is [almost] always a good thing. Obviously there's personal preference involved here, but I've never seen a system where storing the TZ turned out to be a good idea (not that I've seen all that many systems).
Assuming that time is important, perhaps Google Calendar is a good example of timezones and datetimes as strings:
I guess it depends on what you're doing with the data... if it's just "created" then it matters less. But if you're dealing with storing an event in time, it usually occurs in a place that has timezone data... boarding a plane in one timezone and landing in another, a meeting where attendees dial-in from multiple timezones and you need to alert them all at the right time and visually indicate that.
Timezones aren't trivial, and the system might store UTC under the hood, but throwing them out universally and trusting every client to get it right seems less of a good idea than handling them and taking care of that on the API side.
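A small illustration of why zone-aware handling matters — the plane example above, sketched with Python's zoneinfo (Python 3.9+; times and routes invented for illustration):

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+

# A flight departs in one time zone and lands in another: storing only
# a bare UTC instant loses the local "wall clock" each party cares about.
departure = datetime(2013, 2, 7, 9, 0, tzinfo=ZoneInfo("Europe/London"))
arrival = datetime(2013, 2, 7, 12, 30, tzinfo=ZoneInfo("America/New_York"))

# Durations still compute correctly because both values are zone-aware:
print(arrival - departure)  # 8:30:00
```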
It looks to me like you can set what TZ you want to retrieve dates in when performing queries, with the default being the TZ associated with the calendar; this is substantially different from simply returning times with whatever arbitrary TZ was used when creating the record.
Any process that involves date/times and doesn't involve date/times being stored and processed as UTC dates is the wrong solution. As the final step on the presentation layer you can convert them into whatever wacky TZ you want, but passing dates back and forth to the server in non-UTC format is a fast track to Insanityland.
But I would go further and suggest that whether to show those shouldn't just be based on what the API can do, but what the client (and perhaps user controlling the client) can do.
As in... if they can read but not create, then show the "self" link but not the "newcomment" link.
Thus we're not just saying "our API has these endpoints", we're saying "you can do this".
> With REST, the URL for a resource never changes.
I could have an endpoint example.com/users become example.com/subsite/users. It is precisely because the URL might change that you want to give the client URLs to the next possible actions.
First, yes, you could change your "endpoint" URL, but that's not the reason for providing URLs in HATEOAS (just for the record, I hate(oas) that acronym).
You see, REST is different from most other Web services because it doesn't provide access to actions, but to resources. It can be compared to the good old SQL database -- and a resource's URL is the way to uniquely identify that resource, just like in a RDB you would have a combination of a table and a unique identifier.
The reason for providing URLs is to free the consumer from having to construct them by hand each time. In fact, a URL can change -- but not a URL to a single resource (just like you wouldn't arbitrarily change a RDB record's ID); rather, a completely different resource can be provided instead.
I misused "action". Replace "to give the client URLs to the next possible actions." with "to give the client URLs so they can perform their next possible actions on related resources".
My point remains: one of the main reasons of using HATEOAS is so that you can evolve your API without breaking clients. The other main benefit is so that your client can discover your protocol by simply following hypermedia controls.
A RESTful API, in its original understanding as defined by Roy Fielding, is based on hypertext (links between resources), content types and traversal: the only "official" URI in the system (the only one which should be documented and immutable) is that of the API's root. For all other API components, the only supported access method is traversing the system until the component is reached.
As a result, there is no requirement either of URI beauty or of URI stability in a RESTful system: clients may cache URIs during traversal to speed-up subsequent operations (think bookmarks) but that's only an optional optimisation and they must know how to re-traverse the system anyway.
Stable and beautiful URIs are "cooler", but they're irrelevant to the system's RESTfulness and a correctly RESTful system (server & client) would allow randomly changing all URIs in the system (but for the root) regularly without functional breakage.
> Consider the following pair of URLS: [/api/v1/posts/1, /api/v2/posts/1] This implies out-of-band information that these two resources are actually the same entity.
No, it doesn't imply that. Now, it would be accurate to say that it would require out-of-band information to know that the two resources are the same entity, but that's not true either. Assuming that the entity returned by a request to an API-specified endpoint has an API-independent GET-able representation, you can specify that in-band in any response that produces the entity using the Content-Location: response header.
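A hypothetical exchange illustrating the idea — the versioned endpoint declares the canonical, version-independent resource via Content-Location (paths invented for illustration):

```
GET /api/v2/posts/1 HTTP/1.1
Accept: application/json

HTTP/1.1 200 OK
Content-Type: application/json
Content-Location: /posts/1
```

A request to /api/v1/posts/1 returning the same Content-Location would identify the two responses, in-band, as representations of the same entity.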
> The version number belongs in the `Accept:` field of the HTTP request headers
No, it doesn't. This makes no sense, since Accept defines media types acceptable in the response, not the semantics of the request, whereas different API versions mean different request semantics.
Insofar as multiple representations of returned entities are available, it might make sense to use this style to differentiate which representation version the client wanted, but it doesn't make sense to specify the API version, since that may involve the representation expected by the server on client-sent bodies (for non-GET requests) rather than (or in addition to) changing the format of server-sent bodies, while the Accept: header is only about the latter, not the former.
Is that a standard-compliant media type? `version` doesn't appear to be associated with application/json in the IANA media type registry, and I couldn't find anything definitive about adding arbitrary parameters.
I'm pretty sure parameters are not limited to those provided by IANA (just as media types are not necessarily IANA-registered incidentally) e.g. RFC2616's Accept doc shows a `level` parameter to text/html which is not IANA-specified (and seems entirely unspecified)
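For reference, the Accept example from RFC 2616 section 14.1, where `level` appears as a parameter on text/html without any IANA registration:

```
Accept: text/*;q=0.3, text/html;q=0.7, text/html;level=1,
        text/html;level=2;q=0.4, */*;q=0.5
```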
Yea I saw the use of level in the example but didn't think to look at the text/html spec to see if it was there. That would seem to imply that you are correct (though it would be nice if there were an explicit answer on this, if only for purely OCD reasons).
> just as media types are not necessarily IANA-registered incidentally
Per RFC4288 I believe they must be if they are in the standard tree (i.e. the name does not contain a '.' or an 'x-' prefix for historical reasons).
I was planning on diving in deeper in my next post. This was more of a quick and practical list of things you should follow for your API not to suck too much.
Agree with everything except... unix time, really? Unix time has all sorts of weird potential problems, including year 2038, historical inconsistent handling of leap seconds[1], and more.
ISO8601 or bust!
Other than that, yes, I think there are things about REST and REST-like APIs that are not clear and subject to debate about the best way to do them -- but none of them are mentioned in this post; these are the floor, yep.
What's a RESTful way of authenticating users when you will also have a web app that will use the same API? Should there be just OAuth for API users, and cookies/sessions for web users, or is there a way to make both use the same authentication method?
Because that approach isn't recommended anymore. In short, it's much harder for a cache to work effectively based upon Accept & Content-Type headers than it is for it to work on the URL.
For example, here's a quote from Roy Fielding: "In general, I avoid content negotiation without redirects like the plague because of its effect on caching.". In other words, content negotiation for a versioned content-type is OK... if it redirects to a URL that contains the version.
httpbis, the IETF working group in charge of maintaining and developing the core HTTP spec, has deprecated content negotiation: http://trac.tools.ietf.org/wg/httpbis/ticket/81; it's still part of a separate spec (RFC2295), but it's not a core component of HTTP anymore. Why? 'HTTP content negotiation was one of those "nice in theory" protocol additions that, in practice, didn't work out.'
Certainly you can still build an API that works fine using specialized content types. Versions in the URL are a pretty practical approach that work very well too.
Having admittedly not descended very deeply into the rules regarding HTTP caching, why should it be harder to cache based on the combination of the URL and the Accept header than simply upon the URL?
I am not certain which method I prefer, honestly, and could fall either way. The comment that it was nice in theory but didn't work in practice seems to point more to the spec not being followed as written than to a problem with the spec itself.
I would say it's far more difficult if you want to support the full range of capabilities that the Accept header supports. Accept can support more than a single mime-type, and can also include priorities on each mime-type accepted. If you combine that with multiple clients, I think caching based upon that header would be pretty difficult (or you'd have to make some simplifying assumptions).
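To make the difficulty concrete, here's a deliberately simplified Python sketch of q-value parsing (real parsing also handles wildcards, media type parameters and precedence rules) — two textually different Accept headers can express the same preference, so a cache keying on the raw header string fragments needlessly:

```python
def parse_accept(header: str) -> list:
    """Parse an Accept header into (media_type, q) pairs, highest q first.

    Simplified sketch: ignores wildcards and non-q parameters.
    """
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        q = 1.0
        for f in fields[1:]:
            if f.startswith("q="):
                q = float(f[2:])
        prefs.append((fields[0], q))
    return sorted(prefs, key=lambda p: -p[1])

# Two different header strings, one and the same preference order —
# a cache keyed on the literal header would treat them as distinct:
a = parse_accept("application/json, application/xml;q=0.5")
b = parse_accept("application/xml; q=0.5, application/json")
print(a == b)  # True
```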
Impossible? No, not really.
Worthwhile? My opinion is no. I haven't seen the practical benefits of the content-negotiation approach over the URL approach, but I have seen the practical difficulties, so I'm convinced.
> Re: [rest-discuss] Request Header vs. Query String
> On 4/11/06, Roy T. Fielding <fielding@...> wrote:
> In general, I avoid content negotiation without redirects like the
> plague because of its effect on caching.
I prefer to have a version identifier at the beginning of all things that may need versioning. This should probably extend to URLs. Perhaps you need two or more version schemas: one for changes that affect object names (URLs) and a separate version system for changes that do not (this version sent in the headers).
You could argue that versions of the API are themselves resources and should be discoverable - the Salesforce REST API does this and while I actually prefer the specialized content-type approach myself it certainly does work.
> Because it's harder to set the content-type from browsers.
That's only when using HTML forms or links. TFA does not even mention the option of multiple representations of resources, so it's irrelevant: there's nothing hard about adding an Accept header to an XHR.
I had left them out for a future post as I wanted to focus on quick guidelines to get started. Mea culpa for not mentioning them at least: will rectify in a couple days with a follow up post properly framing this.
I've found that it is very helpful to include some sort of pseudo-unique "Incident number" in error responses which you also log in the server's log along with any exception/error details.
This makes it very easy then to correlate issues reported by users with the actual errors/requests in the log.
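A minimal sketch of the pattern in Python, assuming nothing beyond the standard library (names and the id format are illustrative):

```python
import logging
import uuid

logger = logging.getLogger("api")

def error_response(exc: Exception) -> dict:
    """Build an error body carrying a pseudo-unique incident id.

    The same id goes to the server log and into the response, so a
    user-reported id can be grepped straight to the underlying error.
    """
    incident = uuid.uuid4().hex[:12]
    logger.error("incident=%s error=%r", incident, exc)
    return {"error": "Internal error", "incident": incident}

body = error_response(ValueError("boom"))
print(body["incident"])  # a short hex id, e.g. 'f3a1c2d4e5b6'
```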
Every few months people start an argument about POST vs PUT vs PATCH and versioning in the URL vs some header. I think a lot of developers aren't fully aware of PATCH, and making PUT actually fully replace an object can be dangerous. This is probably especially bad if you use some schemaless DB. For versioning, I like to support both. Let the user do what makes them happy. Not really that hard to support URL and header...
Isn't a Web site also a Web service? Following REST, shouldn't the Web site and API be unified, with the same resources being consumed by both humans and robots (by varying the representation)?
There is -- hypermedia. That's the practice of driving your API by following hyperlinks, rather than by constructing URLs and requesting them.
There are several schemes of encoding link information in API resources. I like the Hypertext Application Language, or HAL, the best. Check it out: http://stateless.co/hal_specification.html
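For flavour, a minimal HAL-style document — per the spec linked above, link information lives under a reserved `_links` property (the resource and its fields are invented for illustration):

```json
{
  "_links": {
    "self": { "href": "/orders/523" },
    "next": { "href": "/orders/524" }
  },
  "total": 30.0,
  "currency": "USD"
}
```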
There are ways to do that, I usually find them cumbersome. There are also ways to "easily" maintain API references (Swagger, I/O Docs). I posted something about why we don't use these tools not too long ago: http://devo.ps/blog/2013/02/07/why-we-dropped-iodocs-and-swa...
POST for update and PUT for create, not the other way around.
EDIT: Wow. A lot of people think I'm wrong. I agree with you that it's not the whole truth, but I think it's more true (for most cases) than what the article says.
Both POST and PUT can be used to create and update info, the difference is idempotence. In other words, making the same PUT request over and over again won't change the result beyond the initial request.
For example:
- Repeat "POST /entries" 5 times with the same request body and you'll have 5 new and identical entries on the server.
- Repeat "PUT /entries" 5 times with the same request body and you'll overwrite the set of entries on the server 5 times.
- Repeat "POST /entries/1234" 5 times and you'll have 5 times whatever the server says it will do when you POST on a given entry (eg. if the server keeps a modify count on that entry, it will end up incremented by 5)
- Repeat "PUT /entries/1234" 5 times and you'll overwrite that entry 5 times on the server. The end-result will be exactly the same as if you did the request 1 or 100 times, including any counters that are part of the entry itself (because those counters would be part of that PUT request body, see below).
Also, a PUT request is usually made on a specific, unique resource unless you want to overwrite a complete set. If the id specified in the URL doesn't yet exist, it will be created. The request body includes the complete resource data to be created/overwritten. Think file uploads.
A POST request can either create a resource or update parts of an existing resource based on the parameters given in the request body. When it creates, the server assigns the id and creates the URL of the newly created resource. Whether it creates or updates depends on whether the URL you make the POST request to identifies a specific item or not.
[EDIT: clarified a few things about PUT on an id vs a set]
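The four cases above can be sketched with a toy in-memory "server" in Python (function names stand in for routes; behaviour is illustrative):

```python
# Toy in-memory store illustrating the idempotence difference.
entries = {}
next_id = 1

def post_entries(body: dict) -> int:
    """POST /entries — server assigns the id; repeating creates duplicates."""
    global next_id
    entry_id = next_id
    next_id += 1
    entries[entry_id] = body
    return entry_id

def put_entry(entry_id: int, body: dict) -> None:
    """PUT /entries/<id> — creates or fully overwrites; repeating changes nothing."""
    entries[entry_id] = body

for _ in range(5):
    post_entries({"title": "hello"})
print(len(entries))  # 5 — five new, identical entries

for _ in range(5):
    put_entry(1234, {"title": "hello"})
print(len(entries))  # 6 — the same entry was overwritten five times
```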
PUT for creating and complete updating, PATCH for partial updates.
POST has no useful semantics (more correctly, its semantics is just like those of "do"/"execute" verb) and should be used only if no other verb matches. Or for compatibility reasons ("POST /foo\nX-HTTP-Method-Override: PUT")
> POST has no useful semantics (more correctly, its semantics is just like those of "do"/"execute" verb) and should be used only if no other verb matches.
While POST is often used as a generic "do"/"execute", its actual defined semantics in RFC2616 are for the server to "accept the entity enclosed in the request as a new subordinate of the resource identified by the Request-URI in the Request-Line."
So, aside from fallback uses, its explicitly the correct verb to use for a request that is intended to create a resource where the client doesn't know the identifier of the resource to be created, only its parent. This is particularly likely to be the case anytime the resource to be created is of a kind that will have a server-assigned key that will be part of the URI.
POST has useful semantics. Using PUT requires that you know where to put it, when in the general case you actually don't. Use POST for create when you don't know where it goes; the response should then redirect to the created location.
stuffihavemade is correct. Also, generally a POST request is made to some kind of "resource creating" URI, which includes the URI for the newly-created resource in its response. A PUT request is typically made to the URI of an existing resource.
Well, if you know the full object to save I would use PUT. If you expect the server to fill in ID or something similar I would use POST.
But why use PUT for update? Don't you have to send the entire object (all fields) with a PUT? So if you just want to update one field, a POST is more appropriate?
No, PUT can be for creating too, if the client already knows the final URL. From the spec:
If the Request-URI does not point to an existing resource, and that URI is capable of being defined as a new resource by the requesting user agent, the origin server can create the resource with that URI. If a new resource is created, the origin server MUST inform the user agent via the 201 (Created) response.