
See also on the same site: Hand-coded PDF tutorial | https://brendanzagaeski.appspot.com/0005.html

If you need more, the "free" (trade for your email) e-book from Syncfusion PDF Succinctly demonstrates manipulation barely one level of abstraction higher (not calculating any offsets manually): https://www.syncfusion.com/resources/techportal/details/eboo...

"With the help of a utility program called pdftk[1] from PDF Labs, we’ll build a PDF document from scratch, learning how to position elements, select fonts, draw vector graphics, and create interactive tables of contents along the way."

[1] https://www.pdflabs.com/tools/pdftk-server/



Alas, pdftk is mostly dead. It's written in the ersatz dialect of C++ that only GCJ could compile, and GCJ is officially gone.

The underlying library is just fine, but a decent front end is lacking.


What is that? Googling ersatz gcj gives your comment as the only relevant result. Also GCJ is/was a Java compiler as far as I am aware.


pdftk is a sort-of-C++ program that contains code like:

  #include "pdftk/com/lowagie/text/pdf/PdfReader.h"
  ...
                  if( input_pdf_p->m_password.empty() ) {
                        reader= new itext::PdfReader( JvNewStringUTF( input_pdf_p->m_filename.c_str() ) );

PdfReader is actually java/pdftk/com/lowagie/text/pdf/PdfReader.java in the pdftk source distribution. Yes, this is a C++ program that's instantiating a Java class. As far as I can tell, what's actually going on is that all the Java code is compiled to C++-ABI-compatible .o files using GCJ and pdftk.cc links against them, giving a native program that is nonetheless mostly written in Java. Yikes!

Perhaps unsurprisingly, GCJ didn't get a huge amount of traction, and it has been deleted from the GCC tree entirely. Good riddance, maybe, but it makes it rather difficult to compile pdftk.


Someone should port all that C++ code (there's not that much of it, really) to Java, that would make it a lot easier to compile, wouldn't it?


> ersatz /ˈəːsats, ˈɛːsats/ (adjective): (of a product) made or used as a substitute, typically an inferior one, for something else.


>If you need more, the "free" (trade for your email)

Just a note that Google lets you get a new email address that isn't spammable, through security through obscurity:

-> You can use your gmail address, add a + after it, and add a keyword. So if you are jsmith@gmail.com you can give out jsmith+syncfusionpdfsuccintly@gmail.com and then later if that starts getting spammed you can redirect it.

NOTE:

This is an incorrect solution (Google, please fix this) because anyone can run a regex removing the + part.

Instead the correct solution is that if you have gmail open, in a single click you should be able to generate a high-entropy gmail address (that does not deplete the namespace) and link it on your end with "syncfusionpdfsuccintly".

If I already have gmail open, 7 seconds to create a new gmail address as follows:

  1.  Click something to start the process

  2.  Type "syncfusionpdfsuccintly" to tag it on my end

  3.  Click something to copy a resulting high-entropy gmail name into the clipboard.

I should then be able to paste it into a form, get it delivered straight into my inbox (never spam), and redirect it to spam if it starts getting spammed.
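
A minimal sketch of the three steps above, assuming a mail provider that keeps a private mapping from random aliases to user-chosen tags. `AliasRegistry`, its methods, and the example.com domain are all hypothetical; nothing here is an existing Gmail feature.

```python
import secrets


class AliasRegistry:
    """Hypothetical per-user table of random aliases and the tags they were issued for."""

    def __init__(self, domain="example.com"):
        self.domain = domain
        self._tag_by_alias = {}   # alias address -> tag
        self._blocked = set()     # aliases the user has redirected to spam

    def create(self, tag):
        # 16 random bytes -> 32 hex characters; the tag never appears in the address.
        local = secrets.token_hex(16)
        address = f"{local}@{self.domain}"
        self._tag_by_alias[address] = tag
        return address

    def block(self, address):
        self._blocked.add(address)

    def route(self, address):
        """Return ('inbox', tag), ('spam', tag), or ('reject', None)."""
        tag = self._tag_by_alias.get(address)
        if tag is None:
            return ("reject", None)
        if address in self._blocked:
            return ("spam", tag)
        return ("inbox", tag)


registry = AliasRegistry()
alias = registry.create("syncfusionpdfsuccintly")
print(registry.route(alias))   # delivered to the inbox, labelled with the tag
registry.block(alias)
print(registry.route(alias))   # now routed to spam
```

Because the alias is random, there is nothing in the address to strip: the link back to the tag exists only in the owner's table.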

This would allow people to contact us without ever getting into spam, while entirely removing their ability to contact us if this email address starts getting spammed. There are no downsides.

I believe Google's engineers are smart enough to move from security through obscurity (which relies on no spammer ever inventing and running the regex s/\+[^@]+@/@/ to strip the tag, which would entirely break this security and expose the underlying "protected" email address) to something that works.
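
To illustrate how little effort the stripping takes, here is a sketch in Python. The helper name `strip_plus_tag` is made up for this example; the pattern removes everything from the first `+` up to the `@`.

```python
import re

# Matches a '+', then one or more non-'@' characters, then the '@'.
PLUS_TAG = re.compile(r"\+[^@]+@")


def strip_plus_tag(address):
    """Return the address with any +tag removed from the local part."""
    return PLUS_TAG.sub("@", address, count=1)


print(strip_plus_tag("jsmith+syncfusionpdfsuccintly@gmail.com"))  # jsmith@gmail.com
print(strip_plus_tag("jsmith@gmail.com"))                         # unchanged
```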

Until that day comes you can rely on the security through obscurity to give out a secure email address that can't be spammed. Just add a + and a tag!

Please.

Google: I believe you are smart enough to understand this comment and implement this solution, which can be prototyped in 30 minutes and solves the spam problem forever. You can do it! I believe in you. You're 99.999% there and your security through obscurity works very well for me. I use it.

I hope you will go above and beyond and solve the remaining 0.001%. It would just make me feel better to know that a 13-character regex couldn't defeat your solution.


Couldn't you do something like:

  1.  Register foo@gmail.com

  2.  Give out your email address to friends and family as foo+bar@gmail.com

  3.  Give out your email address to services as foo+{service name}@gmail.com

  4.  Reject anything coming directly to foo@gmail.com


That is a different kind of obscurity, as if that were my protocol a spammer could make up a keyword and it would be delivered until I realized that it wasn't myself who made it up.

Maybe there is a way to whitelist keywords and only deliver tags I add one at a time via filters, but it is not the usual interface.
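
The whitelist idea can be sketched as a small delivery check. This is an illustration, not how Gmail filters work: `ALLOWED_TAGS`, `should_deliver`, and the `foo` mailbox are assumptions borrowed from the example above.

```python
ALLOWED_TAGS = {"bar", "syncfusion"}  # tags the owner has explicitly handed out


def should_deliver(recipient, mailbox="foo", domain="gmail.com"):
    """Accept only foo+<known tag>@gmail.com; reject the bare address and unknown tags."""
    local, _, rcpt_domain = recipient.lower().rpartition("@")
    if rcpt_domain != domain:
        return False
    name, plus, tag = local.partition("+")
    if name != mailbox or not plus:
        return False  # bare foo@gmail.com, or someone else's mailbox
    return tag in ALLOWED_TAGS


print(should_deliver("foo+bar@gmail.com"))     # True
print(should_deliver("foo@gmail.com"))         # False
print(should_deliver("foo+madeup@gmail.com"))  # False
```

A made-up keyword is rejected immediately, so a spammer has to guess a tag that was actually issued rather than invent one.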


Many spammers have caught onto the foo+bar trick. When they detect such an address they will spam both foo+bar@gmail.com and foo@gmail.com.


Read his message carefully. They don't know about foo+bar@gmail.com


I wrote something in node.js that does exactly this. But I have only used it for personal use. I'm honestly surprised that nobody has done this already.

Right now it just silently drops expired addresses. But it is so satisfying to think about bouncing (but stuff like bouncing behavior is something you have to consider when running a mail service).

I was thinking of turning it into a service, but I'd have to read up on how to scale it. Running an SMTP server takes a lot of care. I've found that just using nearlyfreespeech.net's mail forwarding is most reliable to receive emails. So I do that for now, since it is on a small scale.

I just got so frustrated with how this problem has such an obvious technical solution. At least for us users, anyway. It's not a solution for the marketers.

I strongly suspect that Google is very reluctant to do anything to make the email landscape unstable. I think that if Google started offering this, it would shake up so much of their business.


Bouncing email doesn't make much sense if you can reject easily (which would be your case).

Running an inbound SMTP server is much easier than running outbound smtp for laypeople. The software (postfix, exim, etc.) is rock solid (you have to REALLY mess up to lose emails) and the protocol is very forgiving (all serious senders have good retry policies). I encourage you to try!


Actually, I'm gonna try https://grr.la/ryo/


As I mentioned, Google already offers this. Just add a tag with + to your existing gmail address and you instantly create a tagged new one.


Except as pointed out, a spammer (etc) can remove the tag and get the original address using a regex.


Only until Google fixes it as suggested. They have some smart people there and I believe in them!


> Instead the correct solution is that if you have gmail open, in a single click you should be able to generate a high-entropy gmail address (that does not deplete the namespace) and link it on your end with "syncfusionpdfsuccintly".

This is a reasonably priced paid service that already exists. It's not from Google, and no, I won't name it either publicly or privately: I used it for years and I don't want their domain to be banned by those requiring your email address for everything. If you know how to search you will probably find it.

Basically you sign up with them using one valid email address, then on their interface you can create as many addresses as you need (IIRC there is a limit, but I used dozens at a time without problems) and add a keyword to each. All of those addresses are redirected to the email you signed up with, but the From: field also contains the keyword you specified, so if you create an address for each service you sign up for, you instantly recognize who is spamming you when they use that address.

This is very effective and I filtered out a lot of spammers. I'm surprised there aren't more services like this one around, or probably there are many but they keep a low profile to avoid being banned. That's why I'm not going to name that service, sorry. But it does exist and is technically easy to implement.
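
A rough sketch of the forwarding step described above, using Python's standard `email` package. The alias table, the addresses, and the choice to put the keyword in the From display name are assumptions for illustration; how the actual service formats the header is not stated here. The sketch handles plain-text messages only.

```python
from email.message import EmailMessage
from email.utils import formataddr, parseaddr

# Hypothetical alias table: alias address -> (keyword, real mailbox)
ALIASES = {
    "x7k2q9@alias.example": ("syncfusion", "me@example.org"),
}


def forward(msg):
    """Return a copy addressed to the real mailbox, with the keyword in From."""
    alias = parseaddr(msg["To"])[1].lower()
    if alias not in ALIASES:
        return None  # unknown alias: nothing to forward
    keyword, real = ALIASES[alias]
    name, sender = parseaddr(msg["From"])

    out = EmailMessage()
    # Prefix the display name with the keyword so the source is visible at a glance.
    out["From"] = formataddr((f"[{keyword}] {name or sender}", sender))
    out["To"] = real
    out["Subject"] = msg["Subject"]
    out.set_content(msg.get_content())
    return out


incoming = EmailMessage()
incoming["From"] = "Newsletter <news@vendor.example>"
incoming["To"] = "x7k2q9@alias.example"
incoming["Subject"] = "Your e-book"
incoming.set_content("Download link inside.")

print(forward(incoming)["From"])
```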


you're afraid of naming the service but if Google implemented my suggestion they could never be blacklisted. (Unless their high-entropy email tags followed some easily identified pattern.)

I'm not asking for the service to "exist". I'm asking Google to take twenty minutes and fix their solution, which already works but is security through obscurity.


> if Google implemented my suggestion they could never be blacklisted

Google seems to have built their brand intentionally to be the opposite of what you're asking for though; and absolutely they could be blacklisted with a simple "GMAIL ADDRESSES NO LONGER ACCEPTED HERE".

>which already works but is security through obscurity.

I'm not sure which one you are saying is security through obscurity here... blah+real.id@gmail.com... or the high entropy mkKAjgsdf788hf87hf@gmail.com. Both are obscure, but it's a stretch of the imagination to start labelling this a security issue.


> > if Google implemented my suggestion they could never be blacklisted

> Google seems to have built their brand intentionally to be the opposite of what you're asking for though; and absolutely they could be blacklisted with a simple "GMAIL ADDRESSES NO LONGER ACCEPTED HERE".

I think "you can't block GMail" here is meant in the sense that "you can't block the Google crawler". It's certainly technically trivial to do so, but the opportunity cost from lost users will be, for most businesses, unacceptably high.


>I think "you can't block GMail" here is meant in the sense that "you can't block the Google crawler". It's certainly technically trivial to do so, but the opportunity cost from lost users will be, for most businesses, unacceptably high.

Excellent interpretation. Gmail = Google crawler. I've made a note of this now.

What needs to happen next is a deep discussion between yourself and logicallee, in the context of Google crawler as well as how to make gmail come further out of the dark ages with high entropy and no security obscurity.


it's not blah+real.id@gmail.com - it's real.id+blah@gmail.com which currently gets delivered to real.id@gmail.com with a tag of "blah". However this tag can be removed by spammers, hiding where they got my email address.

mkKAjgsdf788hf87hf is not the only possible high-entropy format; it could be of the type that gfycat uses, such as "uncommongrimyladybug". That is quite hard to blacklist.
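
A sketch of generating a gfycat-style name with Python's `secrets` module. The word lists here are tiny placeholders invented for the example; a real generator would need far larger lists to give meaningful entropy.

```python
import math
import secrets

# Placeholder word lists (assumptions for this sketch).
ADJECTIVES = ["uncommon", "grimy", "quiet", "brisk", "amber", "hollow", "lucky", "rapid"]
ANIMALS = ["ladybug", "heron", "otter", "lynx", "marmot", "finch", "gecko", "walrus"]


def word_alias():
    """Two distinct adjectives followed by an animal, e.g. 'uncommongrimyladybug'."""
    first = secrets.choice(ADJECTIVES)
    second = secrets.choice([a for a in ADJECTIVES if a != first])
    return first + second + secrets.choice(ANIMALS)


def entropy_bits():
    """Upper bound on entropy: log2 of the number of word combinations."""
    n = len(ADJECTIVES)
    return math.log2(n * (n - 1) * len(ANIMALS))


print(word_alias())
print(round(entropy_bits(), 1))
```

With eight words per list this is under 9 bits, so it is trivially guessable; lists with thousands of entries would be needed before such names are hard to enumerate.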

Nobody is ever going to stop accepting gmail addresses, that suggestion is pretty ridiculous. Especially since I suggest that these addresses should be delivered straight to your real inbox (unless they start getting spammed). There's no reason people should stop accepting them.


Google mostly solves this with spam filters instead.



