I’ve noticed some files I opened in a text editor have all kinds of crazy unrenderable chars

  • cheese_greater@lemmy.world (OP)
    2 months ago

    I just mean like any file (pdf, jpeg, mp4, mp3, exe—

    mp4/mp3 most famously for me

    I find it so damn cool and incredible that I can record something/anything right now, open the audio in a text editor, and it's all right there. It's in an incomprehensible format, but it's all there.

    It's like a thinking rock etching sound into stone

    • Admiral Patrick@dubvee.org
      2 months ago

      If you’re on Linux, you can convert that to plain printable text by piping it to base64. It works with any file, but I’ll use an image here:

      cat image.webp | base64

      Which yields:

      UklGRroEAABXRUJQVlA4WAoAAAAgAAAAYwAAQgAASUNDUKACAAAAAAKgbGNtcwRAAABtbnRyUkdC
      IFhZWiAH6AAIABoADgAJACBhY3NwQVBQTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA9tYAAQAA
      AADTLWxjbXMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA1k
      ZXNjAAABIAAAAEBjcHJ0AAABYAAAADZ3dHB0AAABmAAAABRjaGFkAAABrAAAACxyWFlaAAAB2AAA
      ABRiWFlaAAAB7AAAABRnWFlaAAACAAAAABRyVFJDAAACFAAAACBnVFJDAAACFAAAACBiVFJDAAAC
      FAAAACBjaHJtAAACNAAAACRkbW5kAAACWAAAACRkbWRkAAACfAAAACRtbHVjAAAAAAAAAAEAAAAM
      ZW5VUwAAACQAAAAcAEcASQBNAFAAIABiAHUAaQBsAHQALQBpAG4AIABzAFIARwBCbWx1YwAAAAAA
      AAABAAAADGVuVVMAAAAaAAAAHABQAHUAYgBsAGkAYwAgAEQAbwBtAGEAaQBuAABYWVogAAAAAAAA
      9tYAAQAAAADTLXNmMzIAAAAAAAEMQgAABd7///MlAAAHkwAA/ZD///uh///9ogAAA9wAAMBuWFla
      IAAAAAAAAG+gAAA49QAAA5BYWVogAAAAAAAAJJ8AAA+EAAC2xFhZWiAAAAAAAABilwAAt4cAABjZ
      cGFyYQAAAAAAAwAAAAJmZgAA8qcAAA1ZAAAT0AAACltjaHJtAAAAAAADAAAAAKPXAABUfAAATM0A
      AJmaAAAmZwAAD1xtbHVjAAAAAAAAAAEAAAAMZW5VUwAAAAgAAAAcAEcASQBNAFBtbHVjAAAAAAAA
      AAEAAAAMZW5VUwAAAAgAAAAcAHMAUgBHAEJWUDgg9AEAALAQAJ0BKmQAQwA+8WSmTqmlKCYvmWqp
      MB4JZQDLnNaF2NMD2L3xQGb5nmLiGhGWxQuD8kwUSXF0u2UTgX0YrR3MY2SsRCNEQ8hZ6WkCUTih
      LdmsElHZVzoMwO/fj4X/ZSNT2R9qgxwqgEed891j4KCNRLK/tUbG3hZ3Mw2kixguSFIEcAgBtv8w
      eAu0PwAA/upMzBqq+dcN8viO7FpqpV6GvPcRILm+HsOQblnpHx03lASjGlSyGbkKUD3xA5KOqgq/
      VEUJ4qF9VoAYFbFhQRAgkvmREk5umMj8sr9Np95+n/oP2Aq2VW5xU4F1xpD8Vd4Dp7Phwm9w/Dnf
      94djRROFRYPZeg/1Q/qiROFRVRu2nBcgndbhc0x0h+kgvT/naeJOEqwNjYPlIiw/DGuxav7+x09R
      mf2mJto3ineDqfyMWUN83PmKqzGHkYGhZrTU478qjlQucDzWkwobnUmzhE6I+mDYkfiUVPcHyXbf
      xXRStyPiPZAkJZrE9OrjFNUeljRQdVTQqeBsy+O9VwDLU5GcKhBQHa4cj+/DGqUhi74WH0EuHsb3
      EgZVNc1FbRm5QFOpjDSprGIRYxe6sFFDrDOg4DhWZRnOa7s68pGaDDpbqrORxzPHXPbs55/1HTas
      DDGzKFmTG4hJ2GUZKqjPcQ+MAAAA
      

      Copy that into a text file and pass it to base64 with the decode flag, and you’ll get the original binary:

      cat data.txt | base64 -d > data.bin

      Inspect it to see what kind of file it is:

      file data.bin -> data.bin: RIFF (little-endian) data, Web/P image
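Roughly speaking, `file` recognizes formats by looking for known "magic" byte patterns near the start of the data. Here is a minimal Python sketch of that idea, not how `file` is actually implemented. The `sniff` helper and its short list of signatures are made up for illustration; the sample string is just the first 16 characters of the Base64 output above.

```python
import base64

# First 16 characters of the Base64 text posted above (decodes to 12 bytes)
HEAD_B64 = "UklGRroEAABXRUJQ"


def sniff(data: bytes) -> str:
    """Very rough format guess from leading magic bytes (hypothetical helper)."""
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "WebP image (RIFF container)"
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "WAV audio (RIFF container)"
    if data[:8] == b"\x89PNG\r\n\x1a\n":
        return "PNG image"
    if data[:4] == b"%PDF":
        return "PDF document"
    return "unknown"


head = base64.b64decode(HEAD_B64)
print(head)         # b'RIFF\xba\x04\x00\x00WEBP'
print(sniff(head))  # WebP image (RIFF container)

# Bytes 4..7 of a RIFF file hold the payload size as a little-endian integer
print(int.from_bytes(head[4:8], "little"))  # 1210
```

The size field counts everything after the first 8 bytes, so the whole file should be 1218 bytes, which matches the length of the Base64 text above.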

      Rename it so you can just double-click it to open it:

      mv data.bin data.webp

      Enjoy the surprise.

      You can also print files like that, scan them using OCR, and then restore them. A very inefficient way to do backups, but it works.

      • cheese_greater@lemmy.world (OP)
        2 months ago

        How is it representing it tho? Like, does it have woven in there an array of hex color codes for every microscopic pixel that makes up the picture?

        Are images and audio files just arrays of frames which are arrays of pixels and sound units?

        • Admiral Patrick@dubvee.org
          2 months ago

          It just re-encodes the raw binary data as printable text characters, so it doesn’t matter what the source is (image, video, database file, etc). It knows nothing about pixels or sound. The source binary data is taken 6 bits at a time, then each group of 6 bits is mapped to one of 64 unique characters, so every 3 bytes of input become 4 characters of output.

          The decoding process is just the reverse of that: mapping the data back to binary form.

          https://en.wikipedia.org/wiki/Base64
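To make the 6-bits-at-a-time idea concrete, here is a small hand-rolled encoder in Python. It is only a sketch for illustration (the `encode_base64` name is made up, and in practice you would just use the standard `base64` module), but it follows the same steps: pack 3 bytes into 24 bits, cut that into four 6-bit groups, and look each group up in a 64-character alphabet.

```python
import base64
import string

# The standard Base64 alphabet: A-Z, a-z, 0-9, '+', '/'
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_base64(data: bytes) -> str:
    """Illustrative Base64 encoder; prefer base64.b64encode in real code."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        pad = 3 - len(chunk)  # 0, 1 or 2 missing bytes in the last chunk
        # Pack up to 3 bytes into one 24-bit number, zero-filling a short chunk
        n = int.from_bytes(chunk + b"\x00" * pad, "big")
        # Slice the 24 bits into four 6-bit groups, most significant first
        chars = [ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # Groups that came purely from zero-fill are written as '=' padding
        if pad:
            chars[-pad:] = "=" * pad
        out.extend(chars)
    return "".join(out)


print(encode_base64(b"RIFF"))  # UklGRg==
print(encode_base64(b"RIFF") == base64.b64encode(b"RIFF").decode("ascii"))  # True
```

Note that the `base64` command line tool also wraps its output into lines, which this sketch does not do.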

        • Num10ck@lemmy.world
          1 month ago

          the answer to your "how" question is: however the format needs to.

          some image and audio formats (especially older ones) are like that, yes. others use compression or other techniques to suit their needs. a sound can be stored as raw recorded samples, or it can be described by parameters like an Attack/Decay/Sustain/Release envelope plus octave and note. a MIDI file, for example, is an audio file format with no samples at all: it stores note events and leaves the actual sound to whatever synth plays it back.
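For the "raw samples" case, uncompressed WAV is about as close as it gets to "a small header followed by an array of numbers". The Python sketch below builds a tiny tone in memory and reads it back using only the standard library. The helper names, the 8000 Hz sample rate and the 440 Hz tone are arbitrary choices for the demo, not anything from the thread.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 8000  # samples per second (arbitrary choice for this demo)


def make_tone(freq_hz: float, count: int) -> list:
    """Return `count` signed 16-bit samples of a sine wave at half volume."""
    return [
        round(0.5 * 32767 * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE))
        for n in range(count)
    ]


def write_wav(samples: list) -> bytes:
    """Serialize samples as a mono, 16-bit, uncompressed WAV file in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes = 16 bits per sample
        w.setframerate(SAMPLE_RATE)
        w.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buf.getvalue()


def read_wav(data: bytes) -> list:
    """Parse a 16-bit mono WAV back into a plain list of integers."""
    with wave.open(io.BytesIO(data), "rb") as w:
        raw = w.readframes(w.getnframes())
    return list(struct.unpack(f"<{len(raw) // 2}h", raw))


samples = make_tone(440.0, 80)
data = write_wav(samples)
print(data[:4], data[8:12])       # b'RIFF' b'WAVE'
print(read_wav(data) == samples)  # True
```

Compressed formats like mp3 or WebP do not store the numbers this directly, which is why the bytes look like noise in a text editor.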

          i once created an image format to be used for spiraling out images: instead of pixel arrays, the data was stored as concentric circles of pixels that i could easily offset.